Sorry, marketers, the days of free data and unlimited storage have come to an end, especially for those relying on Google Analytics.
In July 2023, Google required users to migrate from their old Universal Analytics (UA or GA3) version to their new Google Analytics 4 (GA4) version.* With this change came a huge learning curve: a new data model, discontinued metrics, a new user interface, and a myriad of other changes to get used to. One additional change that marketers must navigate by the end of 2023 is how to deal with the new limited data retention.
*Hopefully, you have migrated to GA4 by now. If not, reach out and we can help.
Data Retention Settings in Google Analytics
Data retention is the practice of storing data for a certain amount of time. When using a service such as Google Analytics, the data is often stored on the vendor’s server(s) and intended to be accessed frequently or somewhat frequently. Each vendor has their own limit for how long they’re willing to store this data. For example, Facebook Ads retains advertising data for 37 months.
Universal Analytics (UA) Data Retention Settings
IMPORTANT: Google has announced that UA data will no longer be available after July 1, 2024. If you would like to retain data past this date, you will need to have a solution in place prior to this date.
Google used to have options ranging from 14 months to no expiry in UA.
Google Analytics 4 (GA4) Data Retention Settings
Data retention settings have since changed in GA4, and the options now range from 2 months to 50 months, depending on whether you are using the free version or paying for Google Analytics 360.
For those on the free version of Google Analytics, the only two options for data retention are 2 months (default) and 14 months. For businesses that have very few seasonal changes, 2 months may be just enough data to analyze month-over-month. Meanwhile, 14-month data retention allows seasonal businesses to analyze year-over-year data.
To view or change your data retention settings:
- Sign in to your Google Analytics account
- Navigate to the Admin section
- Select the property that you want to view or change
- Click on Data Settings in the Property column
- Click on Data Retention
- Edit/view the Event data retention
BONUS: GA4 can reset the retention period for an individual user whenever Google receives a new event from that user within the data retention period. This means that if someone visits your site within the data retention period, the clock restarts from the date of their most recent visit. If they don’t visit the site within the data retention period, then all their information will be deleted.
To view or change these options:
- Sign in to your Google Analytics account
- Navigate to the Admin section
- Select the property that you want to view or change
- Click on Data Settings in the Property column
- Click on Data Retention
- Edit/view the Reset user data on new activity
For most situations, we recommend that user data is reset upon new activity and that the checkbox remains checked.
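The reset behavior described above can be sketched as a small date calculation. This is an illustration only, not Google's implementation: the function names (add_months, user_data_expiry) are hypothetical, and GA4 deletes data on its own schedule, so treat the result as an approximate earliest expiry date.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping the day to the month's end."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def user_data_expiry(first_seen: date, last_seen: date,
                     retention_months: int,
                     reset_on_new_activity: bool = True) -> date:
    """Approximate when a user's data leaves the retention window.

    With the reset option on, the window is counted from the most recent
    visit; with it off, from the first visit (an assumption of this sketch).
    """
    anchor = last_seen if reset_on_new_activity else first_seen
    return add_months(anchor, retention_months)


# Hypothetical visitor: first seen in January 2023, last seen in June 2023.
with_reset = user_data_expiry(date(2023, 1, 15), date(2023, 6, 10), 14, True)
without_reset = user_data_expiry(date(2023, 1, 15), date(2023, 6, 10), 14, False)
print(with_reset, without_reset)
```

With the checkbox checked, a returning visitor's window keeps moving forward, which is why we recommend leaving it on in most situations.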
Key Pieces of Google’s Data Retention Policy
As with many Google products, data retention isn’t black and white. As it stands, the data retention limit does not apply to aggregated and custom report data, which GA4 will continue to retain. The limit does apply when using exploration reports.
To illustrate, let’s assume that you adjusted your data retention to 14 months. Some 24 months after you started your GA4 tracking, you want to view the traffic to your site and, to your surprise, you can see this data even after the 14-month data retention in the reports. This is because Google still allows access to aggregated and custom reporting data past the retention window.
Next, you want to do an in-depth analysis that requires an exploration report, but you don’t have much luck. With exploration reports, or unaggregated data, you see that you can’t access data past your 14-month data retention window.
Simply put, Google’s data retention policy allows users to see aggregated GA4 data indefinitely. This might include things like overall website traffic, pages that are getting the most views, or conversion rates. However, exploration reports will not be available past the retention window, which matters if you want to do things such as track marketing campaign performance, troubleshoot website issues, or identify certain user segments.
IMPORTANT: Exploration reports are a more suitable option for businesses that need to understand their users’ behavior in more detail, process large amounts of data, and/or get highly accurate results. However, on the free version of GA4, exploration reports have a data retention maximum of 14 months. Therefore, a solution outside of GA4 would be recommended to meet more detailed business needs.
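The 24-month scenario above can be expressed as a quick check. This is a rough sketch under stated assumptions: the function names and the "standard"/"exploration" labels are made up for illustration, and the comparison works in whole calendar months rather than mirroring Google's exact deletion schedule.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later`, ignoring the day of month."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def is_visible(report_type: str, data_date: date, today: date,
               retention_months: int = 14) -> bool:
    """Return True if data from `data_date` should still be viewable today."""
    if report_type == "standard":
        # Aggregated report data is not subject to the retention limit.
        return True
    if report_type == "exploration":
        # Unaggregated data is only available inside the retention window.
        return months_between(data_date, today) <= retention_months
    raise ValueError(f"unknown report type: {report_type}")


tracking_start = date(2022, 1, 1)  # hypothetical first day of GA4 tracking
today = date(2024, 1, 1)           # 24 months later

print(is_visible("standard", tracking_start, today))
print(is_visible("exploration", tracking_start, today))
print(is_visible("exploration", date(2023, 6, 1), today))
```

In this sketch, the first day of tracking is still visible in aggregated reports but not in explorations, while data from 7 months ago is visible in both.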
Some other nuances regarding data retention are:
- The data retention option selected applies to user- and event-level data.
- The data retention option selected applies to conversion data because it is event based.
- If you change the data retention period, it is not retroactive and will only take effect for new data coming in.
- Google signals data and Google signed-in data expire after 26 months or after your data retention window, whichever is shorter.
- Data related to age, gender, and interest is always deleted after 2 months.
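The nuances in this list can be summarized as a small lookup. It is a sketch only: the function and category names are our own labels, not GA4 setting names, and the numbers are simply the ones stated in the list above.

```python
def effective_retention_months(setting_months: int) -> dict:
    """Map the chosen retention setting to the window each data category gets."""
    return {
        # User- and event-level data follow the selected option.
        "user_and_event": setting_months,
        # Conversions are event based, so they follow the same option.
        "conversions": setting_months,
        # Google signals data: 26 months or the setting, whichever is shorter.
        "google_signals": min(26, setting_months),
        # Age, gender, and interest data: always 2 months.
        "age_gender_interest": 2,
    }


for setting in (2, 14, 50):
    print(setting, effective_retention_months(setting))
```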
Why the Data Retention Change
If you’re wondering why the change in data retention, you don’t need to look much further than the General Data Protection Regulation (GDPR), which went into effect in 2018. The GDPR is a law that the European Union passed to protect an individual’s privacy and allow them more control over their own data. The law focuses on data transparency, purpose limitation, minimization, accuracy, storage limitation, confidentiality, and accountability. Essentially, companies can no longer collect and store endless amounts of data on citizens.
Close on the heels of the GDPR came the California Consumer Privacy Act (CCPA). This California law was signed in 2018 and took effect in January 2020. It mimics the GDPR but isn’t as strict. Again, this law is meant to ensure that individuals have control over the data that companies collect.
Due to these laws and changing sentiment from the public, Google changed their data retention policy when they rolled out GA4. First, they capped the length of time that they would retain data to satisfy the GDPR. Second, they set the default to 2 months of data. This really gets to the heart of limiting storage, minimizing data, and ensuring more accountability.
Accessing Data Past GA4’s Data Retention Period
Unfortunately, once your data reaches the maximum retention window, the unaggregated data is deleted from Google’s server and gone forever. Unless, that is, you preserve the data in some way.
Should you want to preserve your data, below are some of the most common ways.
Google BigQuery (or other cloud-based data warehouse)
The most obvious way to store your data is Google BigQuery. This is a fully managed data warehouse that is integrated with Google Analytics and requires very little setup. The cost of the solution varies with the amount of storage that you use and the queries that you process.
There are other cloud-based data warehouses that work just as well as BigQuery, such as those offered by Snowflake, AWS, IBM (Db2), and Azure. Each has its pros and cons, with the key advantage of BigQuery being a built-in link and connector to GA4 that maps your data tables from GA4 to BigQuery.
To link to a BigQuery project:
- Sign in to your Google Analytics account
- Navigate to the Admin section
- Select the property that you want to view or change
- Scroll down to BigQuery Links
- Link to an existing BigQuery project/account
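Once the link is in place, the exported events can be queried long after GA4's own retention window has passed. The sketch below only builds the SQL text; it assumes the standard daily export tables (named events_YYYYMMDD, with event_date, event_name, and user_pseudo_id columns), and the project and dataset names are placeholders, not real values. Running the query would need the BigQuery console or a client library, which is not shown here. Also keep in mind that, to our knowledge, the export only contains data collected after the link is created.

```python
from datetime import date


def build_daily_summary_query(project_id: str, dataset_id: str,
                              start: date, end: date) -> str:
    """Build a BigQuery Standard SQL query over the GA4 daily export tables."""
    if start > end:
        raise ValueError("start must not be after end")
    table = f"{project_id}.{dataset_id}.events_*"
    # _TABLE_SUFFIX limits the wildcard to the daily tables in the date range.
    return f"""
SELECT
  event_date,
  COUNT(DISTINCT user_pseudo_id) AS users,
  COUNTIF(event_name = 'page_view') AS page_views
FROM `{table}`
WHERE _TABLE_SUFFIX BETWEEN '{start:%Y%m%d}' AND '{end:%Y%m%d}'
GROUP BY event_date
ORDER BY event_date
""".strip()


# Placeholder identifiers for illustration only.
query = build_daily_summary_query(
    "my-project", "analytics_123456789", date(2023, 1, 1), date(2023, 1, 31)
)
print(query)
```

Because the identifiers are inserted directly into the SQL text, this helper should only be used with values you control.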
On-premises database
An on-premises database might be an option if you have a team of developers. GA4 has an API, which will allow access to data from your account that can then be inserted into a database of your choosing. This is ideal in situations where you want to seamlessly integrate your analytics data with other on-premises data and/or you want to store your data out of the cloud.
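As a minimal sketch of the storage half of this approach, the example below writes daily rows into a local SQLite database. The table name, the columns, and the sample rows are all hypothetical; in practice the rows would come from a call to the GA4 API, which is not shown here, and your team would likely target whatever database you already run.

```python
import sqlite3


def store_daily_rows(conn: sqlite3.Connection, rows) -> int:
    """Upsert (report_date, channel, sessions) rows and return the table's row count."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS ga4_daily_sessions (
            report_date TEXT NOT NULL,
            channel     TEXT NOT NULL,
            sessions    INTEGER NOT NULL,
            PRIMARY KEY (report_date, channel)
        )
        """
    )
    # INSERT OR REPLACE keeps re-runs idempotent: the same day and channel
    # overwrite the earlier row instead of creating a duplicate.
    conn.executemany(
        "INSERT OR REPLACE INTO ga4_daily_sessions (report_date, channel, sessions) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM ga4_daily_sessions").fetchone()[0]


# Made-up rows standing in for an API response.
sample_rows = [
    ("2023-11-01", "Organic Search", 120),
    ("2023-11-01", "Email", 45),
]
conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
print(store_daily_rows(conn, sample_rows))
```

Keying the table on date and channel means a scheduled job can safely reload the most recent days without double counting.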
Cloud-based ETL service
If you want the ability to retain data in a database and combine it with other datasets, but don’t have an IT team to lean on, you might want to go the route of an ETL service. ETLs are tools that extract, transform, and load data, and an ETL company has connectors that can integrate many different sources of data into one place. In this case, the ETL would connect to your GA4 account via an API and store the data in a location of your choosing (your database, their database, a Google Sheet, etc.). Some companies that provide this type of service are Supermetrics, Improvado, Funnel, Stitch, and Fivetran.
Excel
This might sound like an archaic solution to data retention, but it does work. If you know the data that you need, and it is minimal, then downloading the necessary reports from an exploration report and saving them to an Excel file at defined intervals could be a viable option.
Things to Keep in Mind When Making Your Decision
When determining the best solution for data retention, it is important to keep in mind how data collection is shifting in terms of governance, public opinion, and legality. The shift has been to provide the individual with more control, and this should also impact how you collect and store data.
Yesterday’s Mindset
Mantra: Collect and store as much data as possible just in case it is needed in the future.
- Data collection: decentralized and sporadic
- Data retention: unlimited or lengthy
- Data destruction: unorganized and neglected
Today’s Mindset
Mantra: Collect only necessary non-personally identifiable information and store only as long as needed.
- Data collection: centralized and strategic
- Data retention: limited and minimal
- Data destruction: organized and secure
So, some key things to consider are:
- Minimization: you should only retain data that you need. For example, you may only need to know the channels and not the cities of the users.
- Storage limitation: you should only retain data for as long as you need. The digital environment is changing so fast that it may no longer be practical for your business to compare what happened yesterday to what happened four years ago. Once the storage limit is reached, policies should be in place to handle data destruction, covering when and what data should be destroyed and the best practices to achieve that.
- Transparency, accountability, and confidentiality: if you are retaining data related to an individual, you will need to make sure that you are doing so in a secure manner. You will also want to consider the process should an individual request that their data be deleted.
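If you archive data yourself, the storage limitation and destruction points above can be backed by a scheduled cleanup. The sketch below is one possible shape for it, using a hypothetical analytics_archive table in SQLite that stores only the channel (not the city) in the spirit of minimization; the table, columns, and cutoff date are illustrative assumptions, not a recommendation for a specific retention period.

```python
import sqlite3
from datetime import date


def purge_expired_rows(conn: sqlite3.Connection, cutoff: date) -> int:
    """Delete archived rows dated before `cutoff` and return how many were removed."""
    # report_date is stored as ISO text (YYYY-MM-DD), so string comparison
    # matches chronological order.
    cursor = conn.execute(
        "DELETE FROM analytics_archive WHERE report_date < ?",
        (cutoff.isoformat(),),
    )
    conn.commit()
    return cursor.rowcount


# Demo data: one row outside the retention policy, one inside it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analytics_archive (report_date TEXT, channel TEXT, sessions INTEGER)"
)
conn.executemany(
    "INSERT INTO analytics_archive VALUES (?, ?, ?)",
    [("2019-05-01", "Email", 30), ("2023-05-01", "Email", 42)],
)
removed = purge_expired_rows(conn, date(2021, 1, 1))
print(removed)
```

A similar delete keyed on a user identifier could serve as a starting point for handling individual deletion requests.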
Next Steps and How We Can Help
Google’s GA4 data retention policy likely came about because of the GDPR. Some companies may find that relying on 14 months of data is sufficient for their needs. Meanwhile, others may find they need or want data for a longer period. If you find yourself in the latter group, you will need a solution to preserve your data.
If you find that you need help, let us know. We can help if you:
- Are starting from square one and haven’t completed the UA to GA4 migration.
- Are unsure of the retention period that fits your needs.
- Know that you need a longer data retention period, and you’re not sure where to start.
- Know that you need a longer data retention period but don’t know the best technologies and tools to use.
- Know that you need a longer data retention period but don’t have the time.