In this video, we are going to review Humata AI, a new service made possible by the advances in artificial intelligence that followed ChatGPT. Founded in 2022 by entrepreneurs Cyrus Khajvandi and Dan Rasmuson, the project quickly gained traction, securing significant investments from three major companies, one of which has ties to Google’s venture fund.
Humata AI is essentially a cloud-based tool that uses its own AI model to analyze and summarize documents. It’s like having your own personal ChatGPT for docs. You can ask Humata to do various tasks like finding specific information, analyzing text, extracting certain sections, and it even provides links to the pages it references in the document. If it can’t find what you’re looking for, it suggests searching the internet.
Sounds good, right? We prefer facts, so we started by testing Humata and then proceeded to explore its pricing and technical details.
Testing
- Legal Documents
It’s no secret that lawyers often deal with hefty amounts of text. So, we decided to start our testing of Humata with a 15-page document by the US National Constitution Center. This document describes all 27 amendments to the US Constitution.
We began by checking Humata’s understanding of the document’s content. As you can see, the document highlights four eras of amendment introduction. So we asked Humata how many amendments were adopted during the Progressive era.
Answer: 4. Additionally, it’s worth noting that these were amendments #16 to #19. Also, notice that Humata provided a link to its information source, which takes us to page 7. So, between the first page with a brief description and the seventh with a detailed description, Humata opted for a more detailed source.
Then, we tested how well Humata can process text and asked it to briefly narrate the history of the 27th amendment. The document devotes a lot of attention to this topic.
Answer: In the first sentence, we get a description of the essence of the amendment. So, Humata doesn’t just copy appropriate text for the answer but gives it in a structured form. In the next sentence, the answer is supplemented with additional information about the 200-year wait for the amendment’s adoption.
Next, we tested Humata’s attention to detail. We asked if James Maddson was indeed the author of the 27th amendment, intentionally misspelling the name.
Answer: Humata considered this a simple typo and ignored it. However, it correctly identified the legislator’s name in the response.
So we tried another approach. We decided to take a similar-sounding name, for example, Jim Morrison, and ask if he was the author of, say, the 21st amendment.
Answer: Humata couldn’t provide an answer as it didn’t find relevant information in the document.
To test whether Humata is consistent in its responses, we asked the same question but introduced a series of grammatical errors.
Answer: It’s the same as before. This is the correct behavior for such language systems because typos are quite common, and systems like these should understand queries even if they’re flawed.
Lastly, a tricky question. “Who was the author of the 21st amendment?”
Answer: Correct, the amendment doesn’t have a specific author. By the way, we want to point out that the description of this amendment in the document intentionally follows the 18th amendment it repealed. Thus, Humata navigates excellently both in the document’s structure and its content elements.
After this, we conducted numerous similar tests with other types of text documents. There’s no need to go through them in the same detail, as Humata behaved much as it did with legal documents. So, we’ll only highlight the most interesting aspects of these tests going forward.
- Medical Documents
For this test, we used a 21-page medical brochure about strokes.
The first question revealed that Humata can formulate answers even when the necessary information isn’t directly stated in the document. As you can see, it used approximately six paragraphs and took the last sentence as the foundation for the answer. Humata also managed to organize information from different parts of the document, consolidating them into one answer.
However, one negative aspect of the language model surfaced here. One page of the document listed which doctors a stroke patient needs to consult. We formulated a question with two deliberately false options, and Humata chose one of them as the answer, citing an external data source as justification. We checked this source, and the data was correct, but we didn’t like that Humata hadn’t relied solely on the data from our document. Additionally, we had to spend time searching for the external source Humata used, as the link it provided was no longer accessible.
- Culinary Recipes
In this case, Humata was even more proactive in using external data, but once we refined the question, it agreed to use only the information provided in the document. Throughout these tests, Humata’s ability to adapt and to understand the context of user queries was evident, much as with other popular language models.
- Regular Texts
This document contained several reviews of a single establishment in the USA. It had minimal structure and was written in informal language. As you can see, in the first question, Humata exhaustively described the document and its structure, condensing the essence of 20 lines of text into 6.
Additionally, Humata was able to summarize the reviews effectively without losing their essence and, more importantly, provided recommendations according to the request on how the owner should respond to these reviews. As you can observe, the advice given is quite general, but nevertheless, it confirms that besides external data and the user’s document, Humata has an extensive amount of its own data which it can provide to users.
Technical aspects
Since Humata is a cloud-based service, its key technical characteristics are convenience and security. Our tests have shown that Humata has a minimal interface, making it easy to use. Additionally, although it positions itself as a service for PDF documents, it is also compatible with Microsoft Word and PowerPoint files.
It’s also worth noting that our tests have shown that it works best with documents that have clear text formatting. If your PDF file, for example, is an unformatted scanned document, you’ll need to use the OCR function, which is only available in the Team plan. OCR may also be required if Humata detects a large number of diagrams or schematics in your document. However, you can always use third-party OCR solutions to make your documents compatible with Humata. Feel free to ask for assistance in the comments if you need help with this.
As for security, when you send your files, they are encrypted in transit using the TLS 1.3 protocol, while saved files are protected with what the service describes as SHA 256-bit encryption. Strictly speaking, SHA-256 is a hashing algorithm rather than an encryption algorithm, so this most likely refers to some form of 256-bit encryption at rest. Additionally, the service is currently undergoing various industry security certifications, including a SOC 2 audit, one of the most stringent international cybersecurity standards. In brief, this audit examines all company processes in terms of their impact on the security, availability, integrity, confidentiality, and privacy of user data.
But even with all the certifications and audits, consult your IT department before uploading confidential data to Humata if you are a business client. If you are a private user, avoid uploading critical information such as passwords, financial data, and ID documents.
Price
The basis of Humata’s pricing grid is the quota for the number of pages you can scan. In our tests, we used the free plan, which allows scanning of 60 pages per month. The cheapest paid plan is available for students only and costs $2 per month for a quota of 200 pages with the ability to scan above the limit at a price of 2 cents per page.
The basic plan costs $10 for a quota of 500 pages, and in addition to the advantages of the previous plans, it allows you to add three people to the account and gives access to the GPT-4 language model. So, although it is not stated directly, the cheaper plans most likely use the GPT-3.5 model.
Finally, the premium plan costs $100 per month per user. It comes with a quota of 5000 pages, a reduced price of 1 cent per page above the limit, and the ability to add up to 10 team members with different levels of access. It may seem a bit overpriced, but for large companies, this would be an acceptable price for the ability to create an interactive library of work files with a built-in OCR feature.
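If you want to estimate your own monthly bill, the pricing above can be put into a few lines of Python. This is only a rough sketch based on the numbers we quoted, not an official calculator. The names `Plan`, `PLANS` and `monthly_cost_cents` are our own. We also assume that the free plan simply stops at its quota, that the basic plan inherits the 2-cent overage price from the student plan, and that the bill is for a single user.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    fee_cents: int                  # monthly fee, in US cents
    page_quota: int                 # pages included per month
    overage_cents: Optional[int]    # price per extra page; None = no overage


# Figures taken from the review; entries marked "assumption" are not confirmed.
PLANS = {
    "free": Plan(fee_cents=0, page_quota=60, overage_cents=None),    # assumption: no overage
    "student": Plan(fee_cents=200, page_quota=200, overage_cents=2),
    "basic": Plan(fee_cents=1000, page_quota=500, overage_cents=2),  # assumption: same overage as student
    "premium": Plan(fee_cents=10000, page_quota=5000, overage_cents=1),
}


def monthly_cost_cents(plan_name: str, pages: int) -> int:
    """Return the estimated monthly cost in cents for one user."""
    plan = PLANS[plan_name]
    extra_pages = max(0, pages - plan.page_quota)
    if extra_pages == 0:
        return plan.fee_cents
    if plan.overage_cents is None:
        raise ValueError(f"{plan_name} plan cannot exceed {plan.page_quota} pages")
    return plan.fee_cents + extra_pages * plan.overage_cents


if __name__ == "__main__":
    for name in ("student", "basic", "premium"):
        cost = monthly_cost_cents(name, 700)
        print(f"{name}: ${cost / 100:.2f} for 700 pages")
```

Working in whole cents avoids rounding surprises with fractional dollars. Under these assumptions, someone who processes around 700 pages a month pays about the same on the student and basic plans, so the choice comes down to the extra features rather than the page quota.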
Conclusion
Humata is a product that appears simple on the surface but is quite complex internally. It is compatible with key document formats and can recognize text from images if needed. The AI navigates the document structure, understands the context of each query, and is capable of tailoring its responses to specific user requests.
The only downsides we can point to are outdated links to external data sources and the fact that a deliberately misleading question can steer the system toward a wrong answer. However, in Humata’s case, these are not serious problems, as the main purpose of the service is document analysis and processing, at which it excelled in our tests.
The target audience of the service is people working with large volumes of text (such as lawyers, doctors, or engineers), but thanks to its flexible pricing, Humata is also well suited for everyday or educational use.