File stores
Business applications keep changing, and newer applications let end users capture data in ways other than typing it in from a keyboard, which typically yields structured data.
End users can now also feed in data as documents in a variety of formats. Some of the well-known formats are as follows:
- Different document formats (PDF, DOC, XLS, and so on)
- Binary formats
- Image-based formats (JPG, PNG, and so on)
- Audio formats (MP3, RAM, AC3)
- Video formats (MP4, MPEG, MKV)
As you saw in the previous sections, handling structured data is already a challenge, and now we are adding the analysis of unstructured data on top of it. Nevertheless, analyzing unstructured data is now as important as analyzing structured data. By implementing a Data Lake, we can bring in technologies around the lake that allow us to derive value from this unstructured data as well.
Apart from the various file formats and the data they hold, many applications allow end users to capture a large amount of data as free-flowing text, which also needs analysis. Dealing with these comments manually is a Herculean task, so we need to interpret the sentences and comments automatically and get a view of their sentiment. Many technologies are available that can make sense of this free-flowing text and help enterprises handle it appropriately.
For example, suppose an enterprise has a suggestion-capturing system in place and, because of the nature of the business, receives close to 1,000 suggestions a day. Filtering that many suggestions by hand is very hard. Here, we could use sentiment analysis technologies to rate the comments, perform an initial level of filtering based on those ratings, and then hand the filtered suggestions over to a person who can understand and act on them.
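The rate-then-filter flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word lists, the `sentiment_score` and `triage` functions, and the `-0.5` threshold are all hypothetical, and the tiny lexicon-based scorer stands in for whatever sentiment analysis tool an enterprise would actually use. Treating strongly negative suggestions as the ones to review first is also an assumption, not something the text prescribes.

```python
import re

# Hypothetical word lists, for illustration only. A real deployment would
# rely on a trained sentiment model or an established analysis tool.
POSITIVE = {"good", "great", "love", "helpful", "excellent", "fast"}
NEGATIVE = {"bad", "slow", "broken", "hate", "poor", "confusing"}


def sentiment_score(text: str) -> float:
    """Return a rating in [-1, 1]: (positive - negative) / matched words."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    # Text with no known sentiment words is treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total


def triage(suggestions, threshold=-0.5):
    """Split suggestions into (review_first, rest) using the rating.

    Assumption: suggestions rated at or below `threshold` are handed to a
    human reviewer first; everything else waits in the second list.
    """
    review_first, rest = [], []
    for suggestion in suggestions:
        if sentiment_score(suggestion) <= threshold:
            review_first.append(suggestion)
        else:
            rest.append(suggestion)
    return review_first, rest


suggestions = [
    "The checkout page is slow and confusing",
    "Great service, love the new app",
    "Please add a dark mode",
]
review_first, rest = triage(suggestions)
```

With these sample inputs, only the first suggestion lands in `review_first`; the other two stay in `rest`. The threshold and the direction of the filter would be tuned to the business, and the scoring function is the piece to swap out for a real sentiment analysis technology.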