
Exploring the Landscape of Domain-Specific AI: A Comprehensive Overview

Updated: Nov 10, 2023

This is part 1 of our Domain-Specific AI series. In this post, we talk about the different options for tuning and serving AI models in production, and demonstrate outputs from commercial as well as open-source models. Future posts will offer in-depth model comparisons and step-by-step tutorials for constructing a task-specific AI model pipeline.

TL;DR

  • You can achieve high-quality results for domain-specific tasks using AI with minimal setup effort!

  • Going open-source doesn't mean you have to compromise on quality. In fact, our experiment shows that a fine-tuned model with only 1.3B parameters performs on par with the commercial ChatGPT API, despite being a fraction of its size.

  • Serving your own model is 10x-30x cheaper than using the ChatGPT API!

  • Open-source also means enhanced privacy and flexibility. However, if you need to prototype fast and iterate, commercial LLMs have their benefits.


Landscape of Domain-Specific AI


The world of LLMs unlocks many new possibilities for businesses. Ranging from commercial APIs such as OpenAI's GPT and AI21's Jurassic to open-source deployments, there are many options for companies who wish to use AI to enhance their product offering or streamline internal processes. In this series of blog posts, we'll focus on the task of generating appealing e-commerce product descriptions using AI. We'll inspect outputs and customization options from commercial and open-source LLMs and discuss pricing and integrations. Each of the solutions will be analyzed along the following dimensions:

  • Overview of the proposed solution

  • How to implement the solution into your product or workflow

  • Considerations regarding privacy, availability, and pricing

We'll finish with a comprehensive comparison of the quality and performance of the different alternatives, showcasing that lower costs don't mean lower quality. So let's get started!



Commercial API Solutions: Harnessing the Power of Established Platforms

Commercial API solutions have revolutionized content generation, offering businesses a convenient and reliable way to leverage the power of AI. In this section, we'll explore the advantages and considerations when using commercial API solutions like OpenAI's GPT and AI21's Jurassic. One major benefit of these solutions is their ability to generalize well to multiple scenarios, meaning that they can cover many use cases in different domains with minimal adaptations. By examining these platforms, we aim to provide insights into how businesses can effectively leverage commercial APIs for content generation.


Understanding Commercial LLMs

Language Models have emerged as a remarkable breakthrough, enabling machines to understand and generate human-like text. One prominent class of these models is Large Language Models (LLMs), such as OpenAI's GPT-3.5. LLMs are trained on massive amounts of text data, learning the intricacies of language and context. The training process involves exposing the model to diverse content from books, articles, websites, and more, allowing it to discern patterns, grammar, and even nuances of expression. Commercial APIs, like those offered by OpenAI, allow businesses to harness the power of LLMs without the need for extensive technical knowledge. These APIs expose the model's capabilities through a simple interface, enabling developers to integrate natural language understanding, content generation, language translation, and even chatbot functionalities seamlessly into their applications.


How to use a commercial LLM?

Implementing an LLM API for text generation requires setting up a well-structured pipeline. The process typically involves the following steps:

  1. API Access: First, you need to obtain API access from the provider. Different API providers might have varying methods for authentication, so ensure you follow their guidelines to get your access credentials.

  2. API Integration: Next, you will integrate the API into your application's codebase. Most providers offer SDKs and client libraries in popular programming languages such as Python or Java, as well as plain HTTP access via tools like curl, simplifying the integration process.

  3. Input Formatting: To generate text, you pass prompts or queries to the API. Clearly define the task or context in your input to receive relevant responses.

  4. Handling Responses: The API will return the generated text as its response. You can then process and display this text in your application.
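The four steps above can be sketched in a few lines of Python. This is a minimal sketch assuming an OpenAI-style chat-completions endpoint; the URL, the model name, and the `OPENAI_API_KEY` environment variable are illustrative assumptions, and the demo call only runs when a key is actually present.

```python
# Minimal sketch of the 4-step API pipeline: access, integration,
# input formatting, and response handling. Endpoint and model name
# follow OpenAI's chat-completions convention (an assumption here).
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(prompt: str, model: str = "gpt-3.5-turbo") -> dict:
    """Step 3: format the input as a chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def generate(prompt: str) -> str:
    """Steps 1, 2 and 4: authenticate, call the API, extract the text."""
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__" and "OPENAI_API_KEY" in os.environ:
    print(generate("Write a one-line description for a 32-inch LED TV."))
```

In a real application you would add retry logic and error handling around the request, but the shape of the pipeline stays the same.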

Let's focus on step 3, input formatting, in order to gain clarity on how we can generate high-quality content outputs. Two prevalent techniques for using LLMs are zero-shot and few-shot learning.

  1. Zero-Shot Generation: In zero-shot generation, you can prompt the model with a query and specify the desired output format. The API will attempt to generate text that fulfills the task without being explicitly trained on it. This approach is incredibly versatile as it enables the model to perform a wide array of tasks without any fine-tuning.

  2. Few-Shot Generation: Few-shot generation allows you to tailor the model's behavior by providing a few examples in addition to your query. These examples act as context, guiding the model's understanding and improving the accuracy of its responses. Few-shot learning strikes a balance between zero-shot flexibility and the specificity of traditional fine-tuning.

The zero-shot approach is easy and straightforward to use. However, since LLMs are trained on a very large corpus and are very general, it's unlikely to produce satisfactory results: zero-shot generation will rarely return outputs that match the style, voice, and nuances users expect. This is exactly where few-shot generation comes in handy. The generalization power of LLMs has shown impressive "in-context learning" capabilities, which allow users to steer outputs by providing appropriate examples. That said, the downside is that it's not always easy to provide the right context, and both training examples and a proper generation pipeline need to be in place. Moreover, as LLMs have a limited context window, not all tasks can be completed using few-shot generation (we might not be able to fit in the examples), and this approach also increases costs significantly, since the number of tokens to process grows with every example.
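To make the zero-shot/few-shot distinction concrete, here is one way to assemble both kinds of prompt for the product-description task. The template and the example record are illustrative assumptions, not a prescribed format, but the last assertion-style comparison shows why few-shot prompts cost more: every example adds tokens.

```python
# Sketch: assembling zero-shot vs. few-shot prompts for product
# descriptions. The task wording and example records are illustrative.

TASK = "Write an appealing Amazon-style product description as bullet points."

# Hypothetical (title, details, description) records used as few-shot context.
EXAMPLES = [
    ("Acme 1080p Webcam", "1080p, USB, built-in mic",
     "- CRYSTAL-CLEAR VIDEO: Full 1080p resolution for sharp calls.\n"
     "- PLUG AND PLAY: Simple USB connection, no drivers needed."),
]

def build_prompt(title: str, details: str, n_shots: int = 0) -> str:
    """Return a zero-shot prompt (n_shots=0) or prepend n_shots examples."""
    parts = [TASK]
    for ex_title, ex_details, ex_desc in EXAMPLES[:n_shots]:
        parts.append(
            f"Title: {ex_title}\nDetails: {ex_details}\nDescription:\n{ex_desc}"
        )
    parts.append(f"Title: {title}\nDetails: {details}\nDescription:")
    return "\n\n".join(parts)

zero_shot = build_prompt("YOTFUEL 32-Inch HD TV", "32-inch LED, 768p")
few_shot = build_prompt("YOTFUEL 32-Inch HD TV", "32-inch LED, 768p", n_shots=1)
# The few-shot prompt is strictly longer, i.e. more tokens to pay for.
```

The same `build_prompt` helper scales to 2-shot, 3-shot and beyond simply by adding records to `EXAMPLES`, up to the model's context-window limit.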


Let's examine zero-shot and 2-shot responses from the GPT-3.5 API, where the model is prompted with a product title and technical details and outputs a product description.

Product Title

YOTFUEL 32-Inch HD Small TV with Built-in HDMI, USB, VGA,Optical and RF 32” LED TV

Tech Details

Screen Size is 32 Inches. Brand is YOTFUEL. Supported Internet Services are Browser. Display Technology is LED. Resolution is 768p. Special Feature is Flat. Connectivity Technology is VGA, Wi-Fi, Ethernet, HDMI. Mounting Type is Table Mount. Item Weight is 11.23 pounds. Special Features are Flat. Standing Screen Display Size is 32 Inches

Zero-Shot

- High-definition 32-inch LED TV with a resolution of 768p for stunning visuals.

- Built-in HDMI, USB, VGA, Optical, and RF ports for convenient connectivity options.

- Supports internet browsing for easy access to online content.

- Sleek and flat design adds a modern touch to any room.

- Multiple connectivity options including VGA, Wi-Fi, Ethernet, and HDMI for versatile use.

- Easy table mounting for hassle-free installation.

- Lightweight design with an item weight of just 11.23 pounds.

- Standing screen display size of 32 inches for optimal viewing experience.

2-Shot

- VERSATILE CONNECTIVITY: Our 32-inch LED TV comes with multiple connectivity options, including HDMI, USB, VGA, Optical, and RF. Connect your gaming consoles, DVD players, streaming devices, and more for an immersive viewing experience.

- HIGH-QUALITY DISPLAY: Enjoy crisp and clear visuals on the 32-inch screen with a resolution of 768p. The LED display technology provides vibrant colors and sharp details for an enhanced viewing experience.

- SMART FUNCTIONALITY: Browse the web, stream your favorite shows, or access online content with the built-in browser. Connect to Wi-Fi or Ethernet for seamless internet access and enjoy a wide range of entertainment options.

- EASY SETUP: The table mount design allows for easy setup and placement on any flat surface. Whether it's your living room, bedroom, or office, this TV will fit seamlessly into your space.

- SLEEK AND MODERN DESIGN: With its flat and slim profile, this TV adds a touch of elegance to your home decor. The 32-inch screen size is perfect for small to medium-sized rooms, providing an immersive viewing experience without taking up too much space.

We can see that for this use-case, zero-shot outputs are not very good. In practice, this configuration merely echoes the technical details that were fed into the model. Since this task requires domain-specific knowledge (how Amazon descriptions should be written), we must provide the model with contextual examples. Indeed, with just two contextual examples ("2-shot") we were able to get satisfactory results.


Considerations: Privacy, Availability, Pricing, and Flexibility

While commercial APIs present immense potential, certain considerations deserve attention:

  1. Privacy: Using commercial APIs might involve sending data to third-party servers. Ensure you are aware of the data privacy and security policies of the API provider. Be cautious while dealing with sensitive or confidential information. Moreover, some types of data may not be allowed to be shared with third parties, making the commercial API option a non-starter.

  2. Availability: API availability can vary based on server loads and maintenance. Consider the potential impact of API unavailability on your application and have contingency plans in place. Limits also exist on the number of requests per minute and on model throughput.

  3. Pricing: LLM API usage often comes with associated costs. Familiarize yourself with the pricing models of the provider, including any usage limitations, to avoid unexpected expenses. Consider that if you provide some SaaS service, you might need to limit your own users or create custom pricing plans to account for operation costs.

  4. Flexibility and Customizations: The main option for customizing outputs from these models relies on prompt tuning (aka prompt engineering) and the few-shot prompting discussed earlier. While most commercial LLM vendors offer fine-tuning of their models, it can be expensive. For example, using a fine-tuned version of GPT-3.5 (ChatGPT) costs roughly 80x (!) as much as using the out-of-the-box model. Moreover, the more advanced GPT-4 model currently cannot be fine-tuned at all.

 

Open-source

Open-Source Deployments: Customization and Flexibility for Content Generation

Open-source models have emerged as a dynamic alternative for businesses seeking customization and flexibility in text generation. In this section, we will dive into the realm of open-source deployments, exploring models like the celebrated Meta's LLaMA model and other community-driven projects. We'll discuss the advantages and challenges of using open-source models for generating e-commerce product descriptions, such as customization options, technical requirements, and potential limitations. By examining the open-source landscape, businesses can understand the possibilities and trade-offs associated with utilizing these models for content generation.


Unleashing the Potential of Open-Source LLMs for Task-Specific Generation

In the quest for customizable and adaptable content generation solutions, open-source models have emerged as a powerful alternative, providing businesses with the flexibility they need. In this section, we'll delve into the world of open-source deployments, and explore the advantages and challenges of using these open-source LLMs for task-specific text generation, particularly in the context of crafting e-commerce product descriptions. By understanding the landscape of open-source models, businesses can make informed decisions about harnessing their capabilities while considering potential trade-offs.


Understanding Open-Source LLMs

Open-source LLMs are language models that are made freely available by developers and researchers to the community. As is the case with commercial, proprietary models, these models are often pre-trained on vast and diverse datasets, allowing them to grasp language nuances and generate human-like text across various tasks. Open-source deployments provide businesses with the ability to customize and fine-tune these models according to their specific needs, making them valuable assets for task-specific text generation.


One notable example of an open-source LLM is Meta's LLaMA (Large Language Model Meta AI), which has gained recognition for its remarkable performance across different language tasks. LLaMA brought innovation to the open-source community by releasing a model trained on a massive and comprehensive corpus (1T tokens, ~5x more than models available before). This unlocked open-source capabilities not seen before: the massive corpus boosts the performance of smaller models, which can therefore run on consumer hardware and save operational costs in production systems. The recent release of LLaMA-2 further expands these possibilities, with models now licensed for commercial use and trained on 2T tokens.


How to use an open-source LLM?

Here are the steps required to effectively leverage open-source models for your product needs:

  1. Acquiring Model Architecture and Weights: Obtaining the architecture and pre-trained weights of LLMs is remarkably straightforward, thanks to resources like the Hugging Face 🤗 Transformers library. This open-source library provides a treasure trove of pre-trained models, including popular variants like GPT-2, BERT, OPT, T5, LLaMA and more. With just a few lines of code, you can load the desired model and start generating content.

  2. Choosing Between Pre-Trained and Fine-Tuned Models: When it comes to LLMs, you have two primary options: using them as-is or fine-tuning them for specific tasks. For instance, Meta's LLaMA-2 is a massive open-source model that can be employed out-of-the-box, matching the performance of GPT-3.5/4 in some cases. On the other hand, if you have more specialized requirements, smaller models can be fine-tuned to cater to your needs. While fine-tuning demands a bit more effort upfront, it results in a model tailored to your domain, leading to higher-quality outputs. Moreover, smaller fine-tuned models are very cost-effective and scalable in production.

  3. Serving Models in Production: Once your model is in-place, it's time to deploy it in a production environment. This can be achieved through various methods, including setting up servers directly on cloud providers such as AWS or GCP. Alternatively, managed services like Hugging Face Inference Endpoints or Together API offer streamlined solutions for serving your models without the hassle of intricate infrastructure management. These options ensure that your content generation capabilities remain scalable, reliable, and responsive to your users' needs.
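The first two steps above can be sketched with the Hugging Face 🤗 Transformers library. This is a sketch under stated assumptions: the checkpoint name, prompt template, and decoding settings are illustrative choices, and the heavyweight generation demo is gated behind an environment flag since downloading a 1.3B-parameter model is slow.

```python
# Sketch: loading an open-source causal LM from the Hugging Face Hub
# and generating a product description. Checkpoint, prompt template,
# and sampling settings are illustrative assumptions.
import os

def build_prompt(title: str, details: str) -> str:
    """Format the product fields into a generation prompt (assumed template)."""
    return f"Product Title: {title}\nTech Details: {details}\nProduct Description:"

if __name__ == "__main__" and os.environ.get("RUN_GENERATION_DEMO"):
    # Imported here so the lightweight helper can be used without the
    # heavy dependency; the demo is opt-in via RUN_GENERATION_DEMO.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "facebook/opt-1.3b"  # any causal-LM checkpoint on the Hub works
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    inputs = tokenizer(
        build_prompt("32-Inch HD TV", "LED, 768p, HDMI"), return_tensors="pt"
    )
    out = model.generate(**inputs, max_new_tokens=120, do_sample=True, top_p=0.9)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Swapping the checkpoint name is all it takes to compare different open-source models against the same prompt.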


Focusing on the second bullet point, let's review product description outputs from the OPT model in two distinct settings: pre-trained and fine-tuned. This example effectively showcases the efficacy of fine-tuning, a concept we will explore further in our upcoming post.

Pre-Trained Output

- HD, Full HD, Ultra HD and 4K

- Full HD and Ultra HD for a larger picture

- 4K for crystal-clear images

- HDMI for a smooth and responsive picture

- USB for a simple way to charge your devices

- VGA and Ethernet for easier connection

- Wi-Fi for Internet browsing

- Ethernet for a reliable connection

- RF for a powerful picture

- Optical for a clear picture

- Built-in HDMI for a simple way to charge devices

- Optical for a clear picture

- RF for a powerful picture

- Built-in Ethernet for a simple way to charge devices

- HDMI for a simple way to charge devices

- Ethernet for a reliable connection

- RF for a powerful picture

Fine-Tuned Output

- FLEXIBLE DISPLAY TECHNOLOGY: The small size of the YOTFUEL HD TV is perfect for any room in your home. It features a full HD resolution (768P) and built-in digital video inputs including HDMI, VGA, Optical and RF.

- STUNNING COLOR AND SOUND: Enjoy sharp, vivid images with full-range LED backlight and built-in speakers. The TV has an integrated surround sound system that delivers clear, balanced sound and enhances your viewing experience.

- EASY TO USE: The intuitive remote control is easy to use and provides easy access to all of the functions you need. The TV comes with a one-year limited warranty.

- RELIABLE SERVICE: Our friendly customer service team will respond to your request within 24 hours and provide excellent service. If you have any questions about our products or our service, please don't hesitate to contact us.


Benefits and Limitations of Open-Source LLMs

Some notable benefits of using an open-source model include:

  1. Customization: Open-source LLMs offer businesses the freedom to fine-tune the models according to their unique requirements. This enables the generation of highly specific and contextually relevant content, such as e-commerce product descriptions tailored to a particular brand or product category.

  2. Flexibility: Developers can modify the model's architecture and adapt it to various tasks, not just limited to text generation. This versatility allows businesses to use the same model for different applications, optimizing resource utilization.

  3. Cost-Effectiveness: Utilizing open-source models can be cost-effective compared to commercial LLM APIs, which often come with usage-based pricing. Parameter-efficient fine-tuning techniques require very few resources, making the development process cheap and accessible, with GPU costs of less than $1! And once the model is deployed, businesses can use it freely while incurring only operational costs.

However, these models also come with limitations. These include:

  1. Resource Requirements: Fine-tuning open-source models often demands substantial computational resources and time, especially for large-scale applications. Fortunately, parameter-efficient fine-tuning (PEFT) techniques require very few resources, democratizing the training process, which can run on a single consumer GPU. However, adequate machine-learning expertise is necessary for successful fine-tuning.

  2. Quality and Consistency: While open-source LLMs can produce impressive results, they might not match the performance of large-scale commercial models due to differences in training data and resources. On the other hand, high-quality training data and properly implemented fine-tuning pipelines actually allow better control over outputs for specific use-cases.
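As a concrete illustration of the PEFT techniques mentioned above, here is roughly what attaching LoRA adapters with the 🤗 PEFT library looks like. This is a sketch under stated assumptions: the base checkpoint, the hyperparameter values, and the opt-in environment flag are illustrative, not tuned or prescribed choices.

```python
# Sketch: parameter-efficient fine-tuning via LoRA adapters with the
# Hugging Face PEFT library. Only small low-rank adapter matrices are
# trained; the base model stays frozen, which is what makes the
# process cheap enough for a single consumer GPU.
import os

def lora_settings() -> dict:
    """Typical LoRA hyperparameters (illustrative, not tuned values)."""
    return {"r": 8, "lora_alpha": 16, "lora_dropout": 0.05}

if __name__ == "__main__" and os.environ.get("RUN_PEFT_DEMO"):
    # Opt-in demo: downloading the base checkpoint is slow.
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
    config = LoraConfig(task_type="CAUSAL_LM", **lora_settings())
    model = get_peft_model(base, config)
    # Reports trainable vs. total parameters - typically well under 1%
    # of the weights are trainable with LoRA.
    model.print_trainable_parameters()
```

The resulting `model` plugs into a standard training loop or the Transformers `Trainer`, and only the tiny adapter weights need to be saved and served alongside the frozen base model.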


Considerations: Privacy, Availability, Pricing, and Flexibility

Let's review the privacy, availability, pricing, and flexibility considerations for using open-source models, as we did with the commercial LLMs.

  1. Privacy: One of the key advantages of deploying open-source LLMs is the heightened level of data control and privacy they offer. With these models, your data remains under your control, hosted on servers of your choice. This eliminates the need to send sensitive information to external entities, thus bolstering data security and compliance with privacy regulations.

  2. Availability: When you serve your own private LLM, you are the steward of its availability. This characteristic of self-management can be both empowering and demanding. While this grants you a higher level of control, it's important to acknowledge that ensuring constant availability necessitates dedicated attention. Nevertheless, achieving this is simplified by utilizing an endpoint solution, with plenty of options to choose from.

  3. Pricing: The economic benefits of hosting your own LLM service are substantial. Operating costs are inherently tied to the extent of resources you allocate, with smaller models requiring low compute. As a result, you have the flexibility to scale resources up or down in accordance with real-time demand, optimizing costs and aligning them with your budgetary constraints.

  4. Flexibility and Customizations: Open-source implementations put the control in your hands, allowing you to tailor the service to your unique specifications. You can fine-tune the model's behavior, responses, and outputs to align with your industry's language nuances, branding, or even specific user requirements. However, it's crucial to acknowledge that the efficacy of the generated content is intrinsically linked to the quality of the training data and the extent of expertise you invest in refining the model's behavior.


Part I Summary

In this post, we presented a high-level overview of two prominent methodologies for constructing the AI core of your product or internal tool: commercial models versus open-source deployments. Recent advancements in the field of Large Language Models (LLMs) are opening up new avenues for businesses of all sizes to enhance their product offerings, streamline processes, and boost efficiency. The era in which only well-funded enterprises could make substantial AI-related investments and offer such services is becoming a thing of the past.


Many perceive ChatGPT as a valuable tool for individual purposes, whether personal (e.g., event planning, content summarization) or business-related (e.g., writing SQL queries, information extraction). However, this perspective overlooks the extensive opportunities that the AI revolution is ushering in across various domains. Make no mistake; this is indeed a revolution.


While "ChatGPT" remains the generic term in people's minds when considering AI and LLMs, there are numerous alternative options that businesses can explore, which were discussed in this post. Our upcoming post will delve deeper into the concept of fine-tuning, explaining how it is executed and providing thorough comparisons of various model outputs, along with a detailed cost analysis. We hope this will assist you in selecting the solution that aligns best with your requirements, budget constraints, and regulatory considerations.


 

