Case Study: The Importance of Accurate Data Collection and Data Quality Assurance
We all know how important data is for any business that sells products or services online, yet we often forget to double-check the accuracy of the data we receive. The following case study shows why accurate data collection and quality assurance matter.
Ready? Let’s dive in. Our client is a startup focused on promoting a single mobile app.
The client’s initial request was to configure all of their analytics and to visualize the data from various sources in an easy-to-read form. For any project, the general process for setting up analytics is usually the following:
- Define business goals and objectives
- Develop a measurement plan (based on business goals)
- Set up analytics tracking
- Quality-assure the settings and troubleshoot
- Implement data visualization
- Provide insights
Unfortunately, as a young, up-and-coming company, the client had a limited budget and wanted to skip the first four steps and go straight to data visualization. The client assured us that they had already set up various events in Google Analytics and that we could rely on that data.
When we completed the visualization task, it turned out that the data was not entirely accurate, so we began to dig deeper to find out what was wrong. The difference between the Apple Store and Google Analytics numbers for the same metrics was astonishing:
- about 10,000 users displayed in Google Analytics and only 3,000 users in the Apple Store.
- about 1,600,000 sessions registered in Google Analytics and merely 14,000 sessions registered in the Apple Store.
Based on this newly discovered difference, we drew the following conclusions:
- Since a portion of Apple users refuse to share their personal data, some gap between the two sources is expected, but we estimated that the real difference in users shouldn’t be more than 50%.
- Even if you eliminate all of the fake users, the number of sessions displayed in Google Analytics remains far too large: each user would have had to make about 300 sessions per day, which is simply impossible.
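The comparison behind these conclusions can be sketched in a few lines of Python. This is only an illustration: the function names and the way the 50% tolerance is applied (relative to the Apple Store figure) are our assumptions, and the session count is read here as 1.6 million.

```python
def relative_excess(primary: float, reference: float) -> float:
    """Return how far `primary` is from `reference`, as a fraction of `reference`."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return (primary - reference) / reference


def check_sources(ga: dict, store: dict, tolerance: float = 0.5) -> dict:
    """Flag every metric whose two sources differ by more than `tolerance`.

    `tolerance=0.5` mirrors the 50% limit mentioned above (an assumption
    about how that limit would be applied).
    """
    report = {}
    for metric, ga_value in ga.items():
        excess = relative_excess(ga_value, store[metric])
        report[metric] = {
            "excess": round(excess, 2),
            "suspicious": abs(excess) > tolerance,
        }
    return report


# Figures quoted in the case study (sessions read as 1.6 million).
ga_totals = {"users": 10_000, "sessions": 1_600_000}
store_totals = {"users": 3_000, "sessions": 14_000}

report = check_sources(ga_totals, store_totals)
for metric, result in report.items():
    print(metric, result)
```

With the numbers above, both users and sessions are flagged, and sessions are off by two orders of magnitude, which is what pointed us toward fake traffic rather than a simple opt-out effect.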
To get to the root of the issue, we investigated further and found that:
- Google Analytics contained sessions attributed to desktop instead of mobile, which should be impossible because the product is a mobile app.
- When analyzing individual Client IDs, we noticed that many users appeared to view the key landing page of the application every half hour, even at night, which is not plausible for a person using this type of mobile app.
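Both symptoms can be checked automatically once hit-level data is exported. The sketch below assumes a hypothetical export format (a list of dictionaries with `client_id`, `device_category`, and `timestamp` keys) and made-up sample data; it is not tied to any particular Google Analytics API.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def desktop_share(hits):
    """Fraction of hits attributed to desktop; should be near 0 for a mobile-only app."""
    if not hits:
        return 0.0
    desktop = sum(1 for h in hits if h["device_category"] == "desktop")
    return desktop / len(hits)


def periodic_clients(hits, period=timedelta(minutes=30),
                     slack=timedelta(minutes=2), min_hits=6,
                     min_regular_share=0.9):
    """Return client IDs whose hits arrive at a near-constant interval around `period`."""
    by_client = defaultdict(list)
    for h in hits:
        by_client[h["client_id"]].append(h["timestamp"])

    flagged = []
    for client_id, stamps in by_client.items():
        if len(stamps) < min_hits:
            continue  # too little data to judge regularity
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        regular = sum(1 for gap in gaps if abs(gap - period) <= slack)
        if regular / len(gaps) >= min_regular_share:
            flagged.append(client_id)
    return flagged


# Made-up sample: client "A" pings every 30 minutes starting at midnight,
# client "B" behaves like a person, client "C" is attributed to desktop.
start = datetime(2019, 10, 1, 0, 0)
sample_hits = [
    {"client_id": "A", "device_category": "mobile",
     "timestamp": start + i * timedelta(minutes=30)}
    for i in range(12)
]
sample_hits += [
    {"client_id": "B", "device_category": "mobile",
     "timestamp": start + timedelta(minutes=m)}
    for m in (0, 7, 95, 400, 410, 900)
]
sample_hits += [
    {"client_id": "C", "device_category": "desktop",
     "timestamp": start + timedelta(minutes=m)}
    for m in (15, 200)
]

print(periodic_clients(sample_hits))
print(desktop_share(sample_hits))
```

The thresholds (`slack`, `min_hits`, `min_regular_share`) are arbitrary starting points and would need tuning against real traffic.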
In the course of communication with the client’s developers, we determined that:
- Data sent from the backend was creating fake sessions in Google Analytics that were attributed to desktop.
- The application works in conjunction with an additional device owned by the user and regularly synchronizes with it to receive data. This synchronization activates the application and creates fake visits: even when the user isn’t actually using the app, Google Analytics records a view of the main landing page each time the app synchronizes with the additional device.
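One common way to avoid this kind of inflation is to record a screen view only when a person actually opened the app. The sketch below illustrates the idea in Python; the `LaunchReason` values and the `ScreenViewTracker` class are hypothetical names, and this is not necessarily how the client’s developers solved it.

```python
from dataclasses import dataclass, field
from enum import Enum


class LaunchReason(Enum):
    USER_OPENED = "user_opened"          # a person opened the app
    BACKGROUND_SYNC = "background_sync"  # the additional device woke the app


@dataclass
class ScreenViewTracker:
    """Collects screen views and ignores those triggered by background sync."""
    sent: list = field(default_factory=list)
    skipped: int = 0

    def track(self, screen_name: str, reason: LaunchReason) -> bool:
        """Record the screen view unless the app was woken by a sync."""
        if reason is LaunchReason.BACKGROUND_SYNC:
            self.skipped += 1
            return False
        self.sent.append(screen_name)
        return True


tracker = ScreenViewTracker()
tracker.track("main", LaunchReason.USER_OPENED)
for _ in range(3):
    tracker.track("main", LaunchReason.BACKGROUND_SYNC)

print(tracker.sent, tracker.skipped)
```

In a real app, the `sent` list would be replaced by a call to the analytics SDK, and the launch reason would come from the platform’s lifecycle callbacks.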
The client’s developers gradually implemented our suggestions, and the results exceeded all expectations. After the first edit, which stopped data from being sent from the backend:
- the number of users decreased by 10%
- the number of sessions decreased by 63%
After the second edit:
- users decreased by 34%
- sessions were reduced by 82%
The number of Screen Views also decreased, but the number of Screens / Session increased, which is consistent with the removed fake sessions having contained only a view of the main landing page.
By the time the whole round of changes had been completed, the difference between Google Analytics and the Apple Store was:
- Users: 2 to 1 (4,000 in Google Analytics vs. 2,000 in the Apple Store)
- Sessions: roughly 7 to 1 (30,000 in Google Analytics vs. 4,000 in the Apple Store)
This is, of course, still not a perfect match between the two sources, but it is worth considering that:
- Some users do not allow information about themselves to be shared with Apple
- Google and Apple define a session differently
Based on this case, we can’t stress enough that every step of the analytics setup process matters. Data is only useful when it is accurate. You should never neglect to verify the accuracy of the data you receive and to check that values are consistent across different sources.
If you rely on erroneous data, you end up drawing erroneous conclusions. The issue becomes even more dramatic if you are a startup: when presenting your great results, the last thing you want is to realize that you have been misleading investors all along because your analytics were configured incorrectly.
Therefore, when an analytics setup specialist tells you that it is absolutely necessary to verify the data, trust them. They know what they’re talking about, and as this case study shows, it is in your best interest 😉
Originally published at https://insightwhale.com on October 22, 2019.