The AI Book

    TruEra launches free tool for testing LLM apps for hallucinations

    25 May 2023

    TruEra, a vendor providing tools to test, debug and monitor machine learning (ML) models, today expanded its product portfolio with the launch of TruLens, open-source software dedicated to testing applications built on large language models (LLMs) like the GPT series.

    Available for free starting today, TruLens gives enterprises a quick and easy way to evaluate and iterate on their LLM applications and reduce the chances of hallucination and bias in the production stage.

    Currently, only a limited number of vendors offer tools to tackle this aspect of LLM app development, even as enterprises across sectors continue to explore the potential of generative AI for different use cases.

    Why TruLens for LLM applications?

    LLMs are all the rage, but when it comes to building applications based on these models, companies have to go through a tiring experimentation process that involves human-driven response scoring. Essentially, once the first version of an app is developed, teams have to manually test and review its answers, adjust prompts, hyperparameters and models, and then re-test over and over until a satisfactory result is achieved.

    This not only takes a lot of time but is difficult to scale up.

    With TruLens, TruEra is addressing this gap by introducing a programmatic method of evaluation called “feedback functions.” As the company explains, a feedback function scores the output of an LLM application for quality and efficacy by analyzing both the text generated from the LLM and the response’s metadata.

    “Think of it as a way to log and assess direct and indirect feedback about the performance and quality of your LLM app. This helps developers to create credible and powerful LLM apps faster. You can use it for a wide variety of LLM use cases, like chatbot question answering, information retrieval and so on,” Anupam Datta, cofounder, president and chief scientist at TruEra, told VentureBeat.
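    The idea of a feedback function can be illustrated with a minimal sketch. This is not the TruLens API: the `AppRecord` structure, the two scoring functions, the `latency_s` metadata key and the thresholds are all hypothetical, chosen only to show how a score can be computed from both the generated text and the response's metadata.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class AppRecord:
    """One prompt/response pair plus response metadata (hypothetical structure)."""
    prompt: str
    response: str
    metadata: Dict[str, float] = field(default_factory=dict)


# In this sketch, a feedback function maps a record to a score in [0, 1].
FeedbackFn = Callable[[AppRecord], float]


def verbosity_score(record: AppRecord, max_words: int = 50) -> float:
    """Score the generated text: 1.0 within max_words, lower as it grows longer."""
    n_words = len(record.response.split())
    if n_words <= max_words:
        return 1.0
    return max_words / n_words


def latency_score(record: AppRecord, budget_s: float = 2.0) -> float:
    """Score the metadata: 1.0 within the latency budget, lower beyond it."""
    latency = record.metadata.get("latency_s", 0.0)
    if latency <= budget_s:
        return 1.0
    return budget_s / latency


def evaluate(record: AppRecord, feedbacks: Dict[str, FeedbackFn]) -> Dict[str, float]:
    """Run every feedback function on a record and collect the scores by name."""
    return {name: fn(record) for name, fn in feedbacks.items()}


record = AppRecord(
    prompt="What is TruLens?",
    response="TruLens is open-source software for testing LLM applications.",
    metadata={"latency_s": 4.0},  # assumed metadata key, for illustration only
)
scores = evaluate(record, {"verbosity": verbosity_score, "latency": latency_score})
print(scores)  # {'verbosity': 1.0, 'latency': 0.5}
```

    Because each check is an ordinary function, the same set can be re-run automatically after every prompt or model change, in place of a manual review pass.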

    TruLens for LLMs: How it works

    TruLens can be added to the development process with a few lines of code. Once it’s up and running, users can create their own feedback functions — customized to specific use cases — or use the out-of-the-box options. 

    Currently, the software provides feedback functions that test for truthfulness, question-answering relevance, harmful or toxic language, user sentiment, language mismatch, response verbosity, and fairness and bias. Moreover, it also logs how much an LLM is being pinged within the app, giving an easy way to track usage costs.

    “This helps you to also determine how to build the best version of the app at the lowest ongoing cost. All of those pings add up,” Datta noted.
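    As a rough illustration of why logging model calls helps with cost tracking, the sketch below counts calls and tokens for a stand-in model. Everything here is an assumption made for the example: `UsageLog`, `fake_llm`, the word-count approximation of tokens and the price per 1,000 tokens are hypothetical and do not reflect how TruLens itself records usage.

```python
from dataclasses import dataclass


@dataclass
class UsageLog:
    """Running totals of model calls and tokens (hypothetical helper)."""
    calls: int = 0
    total_tokens: int = 0

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        """Register one model call and the tokens it consumed."""
        self.calls += 1
        self.total_tokens += prompt_tokens + completion_tokens

    def estimated_cost(self, price_per_1k_tokens: float) -> float:
        """Estimate spend from total tokens and an assumed flat price."""
        return self.total_tokens / 1000 * price_per_1k_tokens


def fake_llm(prompt: str, log: UsageLog) -> str:
    """Stand-in for a real model call; approximates tokens by counting words."""
    response = "stub answer"
    log.record(len(prompt.split()), len(response.split()))
    return response


log = UsageLog()
for question in ["What is TruLens?", "Who built it?"]:
    fake_llm(question, log)

print(log.calls, log.total_tokens)  # 2 10
print(log.estimated_cost(price_per_1k_tokens=0.002))  # price is an assumed figure
```

    Comparing these totals across app versions shows which design pings the model least for the same quality of answer.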

    Other offerings for LLM applications

    While testing LLM-driven applications for performance and response accuracy is the need of the hour, only a handful of players have launched solutions to deal with it. These include Datadog’s OpenAI model monitoring integration, Arize’s Phoenix solution, and Israel-based Mona Labs’ just-launched generative AI monitoring solution.

    TruEra, for its part, claims that TruLens is best used during the development phase of building an LLM app.

    “This is actually the phase that most companies are in today — they are experimenting with development and really have an acute need for tools to help them iterate faster and home in on application versions that are both effective at their tasks and risk-minimizing. You can, of course, use it on both development and production models,” Datta said.

    According to an Accenture survey, 98% of global executives agree that AI foundation models will play an important role in their organizations’ strategies in the next three to five years. This signals that tools like TruLens will soon see increased demand from enterprises.
