By Nicole Hitner, Exago
Picking through data is a natural and necessary part of reporting. Wading through data isn’t.
Organizations see diminishing returns on their business intelligence efforts when they present users with an overabundance of data. Report authors spend too much time looking for relevant information, and BI systems expend too much processing power on querying and displaying data. The problem often results in frustrated business users made more frustrated by sluggish systems.
But how much data is too much? This is less about hitting a target number of tables or records and more about considering what information is relevant to whom. Segmenting data sources so that individuals can easily locate pertinent information is critical, particularly with very large data sets.
Your organization might be struggling under the weight of its data if:
- BI users grow frustrated wading through irrelevant information.
- Your BI solution exposes data fields that no one ever uses.
Regarding the first symptom, it’s important to distinguish between confusingly labeled data and irrelevant data. If a user cannot understand whether a field or table is relevant to them, they are experiencing a data documentation problem rather than a volume problem. These two issues are often related but are not the same thing. A database can be well documented but still contain large quantities of unhelpful or irrelevant data.
Exposing irrelevant data fields also leads to less efficient data processing. In many cases, extraneous fields exposed to BI users will end up involved in table joining and filtering, resulting in unnecessary slowdowns.
Just because a field exists in your raw, transactional database does not mean it should be in your reporting database. If business users are reporting directly off transactional data, this may explain why they are wading through unhelpful, irrelevant fields. Evaluate all key, id, metadata, and code fields before adding them to your BI data warehouse. Also, be sure to sufficiently denormalize your transactional data for maximally efficient querying.
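As a rough illustration of that evaluation step, the sketch below flags fields whose names suggest they exist for storage rather than reporting, so a person can review them before they reach the warehouse. The naming pattern and the column names are hypothetical, and a real review would rely on knowledge of the schema rather than on names alone.

```python
import re

# Hypothetical naming convention: suffixes that often mark keys, IDs,
# codes, and bookkeeping metadata. Adjust to your own schema.
REVIEW_PATTERN = re.compile(
    r"(^|_)(id|key|guid|code|created_by|row_version)$", re.IGNORECASE
)


def split_fields_for_review(columns):
    """Split column names into (keep, review).

    'review' holds fields a person should evaluate before exposing them;
    nothing is dropped automatically.
    """
    keep, review = [], []
    for name in columns:
        if REVIEW_PATTERN.search(name):
            review.append(name)
        else:
            keep.append(name)
    return keep, review


# Made-up transactional columns for illustration only.
columns = [
    "order_id", "customer_name", "region_code",
    "order_total", "row_version", "ship_date",
]
keep, review = split_fields_for_review(columns)
print("Likely reporting fields:", keep)
print("Evaluate before exposing:", review)
```

The function only sorts fields into two piles. Some ID or code fields are meaningful to business users (an invoice number, for instance), which is why the flagged list is a prompt for review rather than a delete list.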
Lack Of User Research
It takes time and effort for organizations to discover what data their BI users value. Fields other than IDs and keys can be extraneous too, and administrators won’t know which ones without some inquiry. Admins can use activity logging, surveys, and interviews to get a clearer sense of users’ data needs.
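Activity logging can be put to work with very little code. The sketch below assumes a hypothetical log in which each report execution is recorded as a list of the field names it used; the log format and field names are invented for illustration and do not reflect any particular BI product.

```python
from collections import Counter


def unused_fields(exposed_fields, report_log):
    """Return (never_used, usage).

    exposed_fields: field names currently visible to BI users.
    report_log: iterable of lists, one list of field names per report run.
    """
    usage = Counter(field for fields in report_log for field in fields)
    never_used = sorted(f for f in exposed_fields if usage[f] == 0)
    return never_used, usage


# Made-up sample data.
exposed = ["customer_name", "order_total", "fax_number", "ship_date"]
log = [
    ["customer_name", "order_total"],
    ["order_total", "ship_date"],
    ["order_total"],
]
never_used, usage = unused_fields(exposed, log)
print("Never used:", never_used)
print("Most used:", usage.most_common(1))
```

A field that never appears in the log is a candidate for removal, not proof that it is useless. Surveys and interviews help confirm whether users simply have not found it yet.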
Poor Signposting Or Permissioning
Ideally, BI users would be able to find the data most relevant to them either because it’s the only data they have permission to access or because it’s clearly labeled using terminology they understand. A sales representative might look for relevant data in a folder marked “Sales,” for example, but be confused if the only folders available were marked “MSSQL” and “Postgres.” Thoughtful signposting, combined with permissioning, will help funnel users toward the data they’re most likely to be interested in.
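One way to picture signposting and permissioning together is a small catalog that maps business-friendly folder names to the technical sources behind them and to the roles allowed to see them. Everything in this sketch (folder names, role names, source identifiers) is hypothetical.

```python
# Hypothetical catalog: users see the folder label, never the source name.
CATALOG = {
    "Sales": {
        "roles": {"sales_rep", "analyst"},
        "sources": ["mssql.crm.opportunities", "mssql.crm.accounts"],
    },
    "Finance": {
        "roles": {"controller", "analyst"},
        "sources": ["postgres.ledger.invoices"],
    },
}


def visible_folders(role):
    """Return the folder labels a given role is permitted to browse."""
    return sorted(
        label for label, folder in CATALOG.items() if role in folder["roles"]
    )


print(visible_folders("sales_rep"))
print(visible_folders("analyst"))
```

Here the sales representative sees a single folder labeled in familiar terms, while the database engine behind it stays an implementation detail.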
An overabundance of data can lead to frustrated BI users, reporting errors, and slow report queries. All of these contribute to delayed business insights. Left unaddressed, data overwhelm can result in distrust of, and distaste for, the BI system as a whole. Rather than struggle with something slow and confusing, business users might rely on spreadsheets (which become data silos) or IT (which can become a bottleneck). In this way, too much data can result in low returns on your BI endeavors.
Organizations can prevent data overwhelm by only exposing relevant information and making a concerted effort to guide BI users toward the data subsets that concern them. Consider taking these steps:
- Eliminate irrelevant fields during data warehousing. Look to remove fields that facilitate raw data storage or are only machine-readable. Monitor BI usage and adjust your warehousing practices as needed.
- Segment data into clearly labeled subject domains. Properly group and alias tables, models, fields, and any other data elements with which BI users may interact. Once again, monitor usage and user feedback for opportunities to improve this signposting.
- Archive. Protect your database servers (and users’ experiences) by archiving old records. Whether you gate this archive behind a permissions wall or leave it exposed, having it separate from more current records will help prevent data overwhelm and unruly report executions.
- Tenant. Restrict users to certain data sets until their organizational roles or responsibilities require access to other sets. Make sure BI users know who to contact with these requests.
- Document. Even manageable volumes of relevant data can be confusing. Organizations can help prevent the feeling of data overwhelm by providing data dictionaries and business glossaries as supplemental guides.
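The archiving step in the list above can be sketched as a simple split on a cutoff date. The record layout, the `order_date` field, and the cutoff are assumptions made for illustration; in practice this would more likely be done in the database or the warehousing pipeline itself.

```python
from datetime import date


def split_archive(records, cutoff, date_field="order_date"):
    """Split records into (current, archive) around a cutoff date.

    Records dated on or after the cutoff stay in the current set;
    older records go to the archive set.
    """
    current = [r for r in records if r[date_field] >= cutoff]
    archive = [r for r in records if r[date_field] < cutoff]
    return current, archive


# Made-up sample records.
records = [
    {"order": "A-1", "order_date": date(2019, 6, 30)},
    {"order": "A-2", "order_date": date(2023, 1, 1)},
    {"order": "A-3", "order_date": date(2024, 3, 15)},
]
current, archive = split_archive(records, cutoff=date(2023, 1, 1))
print(len(current), "current,", len(archive), "archived")
```

Keeping the two sets separate means everyday reports query only the current records, while the archive remains available to users who actually need historical data.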
Follow these tips, and not only will your BI users feel more at home in their reporting applications, but the applications themselves will be more efficient.
Do you have any tips for preventing data overwhelm? Let us know in the comments!
About The Author
Nicole Hitner is Content Strategist at Exago, Inc., producers of embedded business intelligence for software companies. She manages the company’s content marketing, writes for their blog, hosts their podcast Data Talks, and assists the product design team in continuing to enhance Exago BI. You can reach her at firstname.lastname@example.org.