It is widely known that tracking analytics data can, depending on how and what is tracked, be subject to the provisions of the GDPR (General Data Protection Regulation). There are two main privacy concerns when tracking analytics data:
TL;DR: Self-hosted analytics are "GDPR-friendly" only if no personal data is being stored or user consent is obtained.
It is important to know what is defined as "personal data" within the GDPR:
...all data which are or can be assigned to a person in any kind of way. For example, the telephone, credit card or personnel number of a person, account data, number plate, appearance, customer number or address are all personal data.
The term is not clearly defined and it is said that personal data should be "as broadly interpreted as possible".
Analytics platforms store aggregated visitor data and/or data for each individual session (visit time, referrer, actions performed on the site, etc.).
Sometimes multiple sessions of the same user are grouped together using a unique user identifier that is generally stored in a cookie on the user's device. This allows the webmaster to gather more in-depth stats such as user retention rate, conversion rate, number of visits and so on. From my understanding of the GDPR, this tracked data can be considered to be personal data if:
If the controller has the legal option to oblige the provider to hand over additional information which enable him to identify the user behind the IP address, this is also personal data. In addition, one must note that personal data need not be objective.
To answer the initial question: analytics platforms CAN store personal data.
Whether they do depends mostly on which analytics platform is used and how it is configured: which tracking settings the website owner enables or disables, what event-based data (e.g. a purchase receipt) is collected, and what information is anonymized or completely erased.
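As one illustration of anonymizing a field before it is stored, the sketch below truncates an IP address so that the stored value no longer points at a single host. The function name `anonymize_ip` and the prefix lengths (24 bits for IPv4, 48 bits for IPv6) are assumptions made for this example, not settings taken from any particular analytics platform, and truncation alone may not be enough for the data to stop counting as personal data.

```python
import ipaddress


def anonymize_ip(raw_ip: str) -> str:
    """Zero the host part of an IP address before it is stored."""
    ip = ipaddress.ip_address(raw_ip)
    # Assumed prefix lengths: keep 24 bits of IPv4 and 48 bits of IPv6.
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.77"))           # 203.0.113.0
print(anonymize_ip("2001:db8:abcd:1234::1"))  # 2001:db8:abcd::
```

Doing the truncation before the value is written anywhere (including logs) matters more than the exact prefix length, since a full address stored even briefly is still stored.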
Yes, self-hosting is much better for user privacy than using a third-party service.
First of all, self-hosted setups make it easier to respect Art. 44 of the GDPR, which states that you can only transfer personal data to a third country (outside the EU) if that country also enforces data-protection regulations:
Any transfer of personal data [...] to a third country or to an international organisation shall take place only if [...] the conditions laid down in this Chapter are complied with by the controller and processor, including for onward transfers of personal data from the third country or an international organisation to another third country or to another international organisation.
Secondly, public analytics services are a point of data centralization: the (personal) data of many users, from multiple organizations, is aggregated and processed. This can lead, accidentally or intentionally, to mass surveillance, data leaks, unfair business advantages if the data is sold to competitors, monopolies, the creation of accurate user personas for ad targeting and much more.
Although the GDPR provisions were created to reduce ad targeting and user data selling/sharing between entities without consent, analytics data are also affected by those regulations.
To comply with those regulations while still being able to use analytics and gather valuable statistics to help you grow your business, it is recommended that:
You track no personal data, or as little as possible. If any personal data has to be tracked, request user consent before doing so.
You keep the tracked data secure and do not share it with third parties, especially if those third parties are located outside the EU or cannot ensure the privacy of the data.
Preferably your analytics data should never leave your organization (to improve user privacy and business competitive advantages).
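The first recommendation can be sketched in code as a filter that runs before an event is stored. This is a minimal sketch under stated assumptions: the function `prepare_event`, the `has_consent` flag and the field names in `PERSONAL_FIELDS` are hypothetical, and which fields actually count as personal data depends on your own setup.

```python
from typing import Any

# Hypothetical field names treated as personal data in this sketch.
PERSONAL_FIELDS = {"user_id", "ip_address", "email"}


def prepare_event(event: dict[str, Any], has_consent: bool) -> dict[str, Any]:
    """Return the version of an event that may be stored."""
    if has_consent:
        # The visitor agreed to tracking: keep the event unchanged.
        return dict(event)
    # No consent: drop every field listed as personal data.
    return {key: value for key, value in event.items() if key not in PERSONAL_FIELDS}


event = {
    "page": "/pricing",
    "referrer": "https://example.com",
    "user_id": "abc123",
    "ip_address": "203.0.113.77",
}
print(prepare_event(event, has_consent=False))
# {'page': '/pricing', 'referrer': 'https://example.com'}
```

Filtering on a deny-list keeps the example short; an allow-list of known non-personal fields would be the more cautious choice, because a newly added field would then be dropped by default.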
Self-hosted analytics are "GDPR-friendly" only if no personal data is being stored or user consent is obtained.