What if UX is the new best friend of data-intensive applications?

I spend several hours a day reviewing applications that rely on huge amounts of data. They provide insights to customers, simulate hypotheses, or simply help look for new business opportunities.
More and more often, software engineers build these applications on web-based user interfaces that let users explore the data and that consume it intensively. Behind them sits a data pipeline that prepares the data in advance to serve customer needs efficiently.

With this architecture, you can feed an application with aggregates computed over large volumes of data. Because the data is pre-computed and stored in a database, you avoid the latency and memory-management issues of computing it on demand. This only works if you can anticipate the queries the user will perform.
The word anticipation is important. It means that the data must be prepared to fit user requests, and that those workflows are known in advance. For example, with the monthly revenue of the top twenty products, or a list of the next ten maintenance tasks to prioritize on a car, the user workflow is perfectly managed. This situation is ideal because you can prepare exactly what the user wants to display. There will be a few questions about refresh time and the number of concurrent users, but honestly, this is not a big deal.
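To make the idea of anticipation concrete, here is a minimal sketch of the "top products by monthly revenue" case. It assumes a hypothetical `sales` table and uses SQLite only to keep the example self-contained. The table, column, and function names are all illustrative, not taken from a real system. The pipeline step does the expensive aggregation once, and the serving step is a plain lookup.

```python
import sqlite3


def build_top_products(conn, top_n=20):
    """Pipeline step: pre-compute the top-N products by revenue for each month."""
    conn.execute("DROP TABLE IF EXISTS top_products_monthly")
    conn.execute(
        "CREATE TABLE top_products_monthly ("
        "month TEXT, rk INTEGER, product TEXT, revenue REAL, "
        "PRIMARY KEY (month, rk))"
    )
    months = [row[0] for row in conn.execute("SELECT DISTINCT month FROM sales")]
    for month in months:
        # The expensive aggregation happens here, ahead of any user request.
        rows = conn.execute(
            "SELECT product, SUM(amount) AS revenue FROM sales "
            "WHERE month = ? GROUP BY product "
            "ORDER BY revenue DESC, product LIMIT ?",
            (month, top_n),
        ).fetchall()
        conn.executemany(
            "INSERT INTO top_products_monthly VALUES (?, ?, ?, ?)",
            [(month, rk, product, revenue)
             for rk, (product, revenue) in enumerate(rows, start=1)],
        )
    conn.commit()


def top_products(conn, month):
    """Serving step: a cheap primary-key lookup, no aggregation at request time."""
    return conn.execute(
        "SELECT product, revenue FROM top_products_monthly "
        "WHERE month = ? ORDER BY rk",
        (month,),
    ).fetchall()


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("2024-01", "A", 100.0), ("2024-01", "B", 250.0),
     ("2024-01", "A", 200.0), ("2024-01", "C", 50.0)],
)
build_top_products(conn, top_n=2)
print(top_products(conn, "2024-01"))
```

The serving query stays just as cheap however large `sales` grows, but only as long as the user keeps asking the question the pipeline was built for.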
Things get more complex once users can generate queries on their own. Imagine that the user no longer wants the first ten rows but dynamically expands the query to thousands. Imagine also that the product aggregation is too broad, and we now need to exclude or filter on specific products.
In fact, the more filtering and aggregation features you add to the application, the harder it becomes to anticipate every possible query. At first you had a responsive and efficient system, but now you have to make trade-offs between performance, database costs, and even customer satisfaction. At that point, you need to step back and think about your options before trying to build the impossible.
- The first one will be analyzed in another post. It basically involves an architecture switch: an asynchronous system sends the query to your distributed system and guarantees a result, at the price of a higher response latency.
- The second one is to work directly on the user needs, tuning or hacking the actual experience so that user queries become easier to anticipate.
Finding a path between the engineers' need for anticipation and the users' gluttony for data involves looking at the user experience. You need to think about the user interface, the user needs (or what they seem to be), and the workflow users want to follow.
Obviously, since we're dealing with user feelings and this post is not about the data itself, there are no silver-bullet solutions.
But here are the points I try to optimize or look at first when I face such an issue.
Access patterns
How does the user try to access the data? Do they use a direct reference? Do we need to stick to the common pattern of showing a list of objects before reaching the one being searched for? Data navigation and application performance are intimately linked. Whenever possible, reduce navigation to the minimum.
I've met many users who know the references of their data well. Experienced business users have dealt with Excel files for years and have probably built methodologies or tricks to find data faster. Leveraging this experience instead of building a traditional web navigation can save compute and time by avoiding useless queries. Imagine that instead of building a "Search > List > Proposition > Object > Result" flow, you create a "Business Object Reference > Object > Result" flow.
This approach is also interesting on the engineering side, as it helps engineers anticipate which fields are crucial for building the right indexes and caches to improve response time. You can avoid preparing the whole data table and make sure that a small subset of the data is enough to fit the need. In the example above, you can even avoid building lists of objects, pagination, and sort orders.
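As a rough illustration of direct access by reference, the sketch below indexes a hypothetical `contracts` table on its business reference and caches lookups in memory. The table, the `CT-...` references, and `get_by_reference` are assumptions made up for the example. The point is only that a known access pattern tells you which single index and which cache key matter.

```python
import sqlite3
from functools import lru_cache

# Hypothetical data set: business objects identified by a reference users already know.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contracts (reference TEXT, customer TEXT, revenue REAL)")
# The access pattern is known in advance, so one index on the reference is enough.
conn.execute("CREATE UNIQUE INDEX idx_contracts_reference ON contracts (reference)")
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?)",
    [("CT-0001", "Acme", 120.0), ("CT-0002", "Globex", 80.0)],
)


@lru_cache(maxsize=1024)
def get_by_reference(reference):
    """Fetch exactly one object by its business reference (None if unknown)."""
    return conn.execute(
        "SELECT reference, customer, revenue FROM contracts WHERE reference = ?",
        (reference,),
    ).fetchone()


print(get_by_reference("CT-0002"))  # first call hits the index
print(get_by_reference("CT-0002"))  # second call is served from the cache
```

No list endpoint, pagination, or sort order is needed here. In a real application the cache would also need an invalidation strategy tied to the pipeline refresh, which this sketch leaves out.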
One complex object at a time instead of a list
You should never load data that is not directly displayed to the user. It costs a lot in response time and computing resources while being perfectly useless to the user. For example, take a query that performs a compute-intensive aggregation over several keys in your database. There are two scenarios:
- The user provides the scope and the query is processed. It takes time before the result is available, and the user ends up scrolling to find the few keys that are really interesting.
- The user anticipates the key they are looking for, and the query is processed for this single object. Then, by clicking on a forward reference, they can jump to another object, which triggers a new query.
Focusing on one object at a time is a good way to save time and processing, while giving engineers a workload they can anticipate regardless of the real volume of data. In the first case, you have to aggregate over a stack of objects that keeps growing (I want 100 objects, then 1,000...). In the second, you help the user focus and provide only the information they are looking for.
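The second scenario can be sketched as follows. The data (`MEASUREMENTS`), the `ObjectView` shape, and the idea of exposing the next key as the forward reference are all assumptions for the example. The aggregation runs for a single key per request, and the response carries the reference the user can click to trigger the next query.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical raw data: measurements grouped by object key.
MEASUREMENTS = {
    "pump-1": [3.0, 5.0],
    "pump-2": [4.0],
    "pump-3": [1.0, 1.0, 1.0],
}
KEYS = sorted(MEASUREMENTS)


@dataclass
class ObjectView:
    key: str
    total: float
    next_key: Optional[str]  # forward reference to another object, if any


def load_object(key):
    """Aggregate a single object instead of the whole key space."""
    values = MEASUREMENTS[key]
    position = KEYS.index(key)
    next_key = KEYS[position + 1] if position + 1 < len(KEYS) else None
    return ObjectView(key=key, total=sum(values), next_key=next_key)


view = load_object("pump-1")
print(view)
# Following the forward reference triggers a new, equally small query.
print(load_object(view.next_key))
```

The cost of each request depends on the size of one object, not on how many objects the user might eventually want to see.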
Filter, filter, filter.
You get the trick: the idea is to work on the user experience to find every possible way to avoid requesting data that will not be used. But when you let users generate queries, you always have to deal with the risk of performance and experience issues.
In fact, instead of wasting time looking for the data they need in a pile of rows, power users should build the query that fits their need exactly. It could be direct access, a set of filters, or even a domain-specific language to query the data efficiently. Once users share the "rules", engineering can start preparing datasets and views that fit those rules, leveraging the right indexes or preparing expensive aggregates.
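One way to turn those shared "rules" into something engineers can prepare for is to accept only a known set of filters and map each one to a clause on a prepared view. The sketch below assumes a hypothetical `revenue_by_product` view and made-up filter names. It is an illustration of the idea, not a full query builder.

```python
# Filters agreed with users; each maps to a clause on a prepared, indexed view.
# The view name and the columns are hypothetical.
ALLOWED_FILTERS = {
    "month": "month = ?",
    "product": "product = ?",
    "min_revenue": "revenue >= ?",
}


def build_query(filters):
    """Build a parameterized query from a dict of agreed filters."""
    unknown = set(filters) - set(ALLOWED_FILTERS)
    if unknown:
        # Anything outside the agreed rules is rejected instead of run ad hoc.
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    clauses, params = [], []
    for name in sorted(filters):  # stable order gives cache-friendly SQL text
        clauses.append(ALLOWED_FILTERS[name])
        params.append(filters[name])
    sql = "SELECT month, product, revenue FROM revenue_by_product"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = build_query({"product": "A", "month": "2024-01"})
print(sql)
print(params)
```

Because the set of possible clauses is finite and known, the indexes and aggregates behind the view can be prepared for exactly those combinations.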
Wrap up
To continuously reduce the risk of waste, you need to work closely with users to get a comprehensive understanding of how the data is used. It will then be easier to avoid the common experience issues that appear when you build a product that follows industry standards instead of building a fitted experience.
Whatever role you play in building such an application, always keep in mind that there are multiple ways to achieve the same goal. Hacking your current perception of user needs can bring great benefits to the user experience directly, but also to application performance, and it can definitely help you reduce wasted computing and user frustration.
