In this quickstart, we will show you how to:

  • Set up LangChain, LangSmith, and LangServe (a minimal setup sketch follows the roadmap below)
  • Use LangChain's most basic and common components: prompt templates, models, and output parsers
  • Use the LangChain Expression Language (LCEL), the protocol LangChain is built on and the foundation for chaining components together
  • Build a simple application with LangChain
  • Trace your application with LangSmith
  • Serve your application with LangServe (a minimal serving sketch appears at the end of this quickstart)

LangChain makes it possible to connect LLMs to external data sources and computation. In this quickstart, we will walk through several different ways of doing that.

  • 1 We start with a plain LLM call that relies only on the model's internal knowledge to answer. — Model I/O
  • 2 Next, we build a simple LLM chain that relies only on the information in the prompt template to respond. — Model I/O + Chain
  • 3 Then we build a retrieval chain that fetches data from a separate database and passes it into the prompt template. — Model I/O + Retrieval + Chain
  • 4 After that, we add chat history to create a conversational retrieval chain. This lets you interact with the LLM in a chat fashion, so it remembers previous questions. — Model I/O + Retrieval + Chain + Memory
  • 5 Finally, we build an agent that uses the LLM to decide whether it needs to fetch data to answer a question. We only cover these topics briefly; there is much more detail behind each of them, and we will link to the relevant documentation. — Model I/O + Retrieval + Chain + Memory + Agent
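
Before the first example, here is a minimal setup sketch. It is an assumption about the environment rather than part of the original walkthrough: the package list, the DEEPSEEK_BASE_URL / LANGSMITH_API_KEY variable names, and the default embedding model are placeholders you may need to adapt. The LANGCHAIN_TRACING_V2 / LANGCHAIN_API_KEY variables are what LangSmith reads to record traces for the chain and agent runs below.

# Minimal setup sketch (assumed environment):
# pip install langchain langchain-openai langchain-community langchain-text-splitters faiss-cpu sentence-transformers

import os

# Keys and constants referenced by the examples below; here they are read from the environment.
DEEPSEEK_API_KEY = os.environ["DEEPSEEK_API_KEY"]        # passed to every ChatOpenAI(...) call
BASE_URL = os.environ["DEEPSEEK_BASE_URL"]               # OpenAI-compatible endpoint serving deepseek-chat
TAVILY_API_KEY = os.environ.get("TAVILY_API_KEY", "")    # only needed for the agent example (web search tool)
EMBEDDING_MODEL_NAME = os.environ.get(                   # local path or Hub name for HuggingFaceEmbeddings
    "EMBEDDING_MODEL_NAME", "sentence-transformers/all-MiniLM-L6-v2"
)

# Optional: enable LangSmith tracing so every run below is logged to your LangSmith project.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = os.environ.get("LANGSMITH_API_KEY", "")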

1 LLM

from langchain_openai import ChatOpenAI

# DEEPSEEK_API_KEY and BASE_URL are assumed to be defined beforehand (see the setup sketch above)
llm = ChatOpenAI(model="deepseek-chat", api_key=DEEPSEEK_API_KEY, base_url=BASE_URL)


res = llm.invoke("how can langsmith help with testing?")
print(res.content)

The LLM replies as follows:

LangSmith is a tool designed to help developers and data scientists streamline the process of testing and evaluating language models. Here are some ways LangSmith can assist with testing:

1. **Automated Testing**: LangSmith can automate the testing process by running predefined test cases against your language models. This includes unit tests, integration tests, and system tests to ensure that the model behaves as expected across various scenarios.

2. **Performance Metrics**: It provides detailed performance metrics such as accuracy, precision, recall, and F1-score. These metrics help in understanding how well the model is performing and where it might be falling short.

3. **Error Analysis**: LangSmith allows for detailed error analysis. By identifying and categorizing errors, developers can focus on improving specific areas of the model that are causing issues.

4. **Regression Testing**: As models are updated and improved, LangSmith can help in performing regression testing to ensure that new changes do not negatively impact existing functionality.

5. **Continuous Integration (CI)**: LangSmith can be integrated into CI pipelines, allowing for automated testing with every code commit or pull request. This ensures that the model remains stable and performs well even as it evolves.

6. **User Feedback Integration**: It can integrate user feedback into the testing loop. This means that real-world usage data can be used to continuously improve and refine the model.

7. **Customizable Test Suites**: Developers can create customizable test suites tailored to their specific needs and use cases. This flexibility ensures that the testing process is comprehensive and relevant.

8. **Collaboration Tools**: LangSmith often includes collaboration tools that allow teams to work together on testing efforts. This can include shared dashboards, reporting tools, and communication channels.

9. **Scalability**: It can handle testing at scale, allowing for the evaluation of models across large datasets and diverse user inputs.

10. **Compliance and Standards**: LangSmith can help ensure that language models meet industry standards and compliance requirements, which is crucial for applications in regulated industries.

By providing these features, LangSmith can significantly enhance the efficiency and effectiveness of testing language models, leading to more reliable and robust AI systems.

2 LLM Chain

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="deepseek-chat", api_key=DEEPSEEK_API_KEY, base_url=BASE_URL)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are world class technical documentation writer."),
    ("user", "{input}")
])

parser = StrOutputParser()

chain = prompt | llm | parser

res = chain.invoke({"input": "how can langsmith help with testing?"})
print(res)

The LLM chain replies as follows:

LangSmith is a platform designed to help developers and data scientists build, test, and monitor language models. It provides a suite of tools that can be particularly useful for testing the performance and reliability of language models. Here's how LangSmith can help with testing:

1. **Automated Testing**: LangSmith allows you to create automated test suites for your language models. You can define test cases that evaluate the model's responses against expected outcomes, helping you catch issues early in the development cycle.

2. **Performance Monitoring**: The platform provides real-time monitoring of model performance, including response times, error rates, and other key metrics. This helps you identify bottlenecks and areas for improvement during testing.

3. **Data Annotation and Evaluation**: LangSmith offers tools for annotating and evaluating model outputs. You can use these tools to manually review and score model responses, ensuring they meet your quality standards.

4. **Integration with CI/CD Pipelines**: LangSmith can be integrated with continuous integration and continuous deployment (CI/CD) pipelines, allowing you to automate the testing process as part of your development workflow. This ensures that every update to your model is thoroughly tested before deployment.

5. **Model Comparison**: The platform enables you to compare the performance of different models or different versions of the same model. This helps you make informed decisions about which model to deploy based on test results.

6. **Feedback Loop**: LangSmith supports a feedback loop where you can collect user feedback on model outputs and use this data to refine your tests and improve model performance over time.

7. **Scalability**: Testing language models at scale can be challenging, but LangSmith provides the infrastructure to handle large volumes of test cases efficiently, ensuring comprehensive coverage.

8. **Collaboration**: The platform facilitates collaboration among team members by providing a centralized location for sharing test results, annotations, and other relevant data. This helps in maintaining consistency and quality across different stages of testing.

By leveraging these features, LangSmith can significantly enhance the testing process for language models, leading to more reliable and high-performing AI applications.

3 LLM Retrieval Chain

The Retrieval module covers document loading, text splitting, embedding encoding, vector stores, and retrievers.

import warnings
warnings.filterwarnings(action="ignore")
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain

llm = ChatOpenAI(model="deepseek-chat", api_key=DEEPSEEK_API_KEY, base_url=BASE_URL)

prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:


{context}


Question: {input}""")

# data source and split
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter()
documents_split = text_splitter.split_documents(documents)

# Local embedding model path (assumed); any sentence-transformers model name would also work here
embeddings = HuggingFaceEmbeddings(model_name=r"F:\Bert")
vector = FAISS.from_documents(documents_split, embeddings)

document_chain = create_stuff_documents_chain(llm, prompt)
retrieval_chain = create_retrieval_chain(vector.as_retriever(), document_chain)

response = retrieval_chain.invoke({"input": "how can langsmith help with testing?"})
print(response["answer"])

The LLM retrieval chain replies as follows:

LangSmith can help with testing in several ways:

1. **Initial Test Set**: LangSmith allows developers to create datasets, which are collections of inputs and reference outputs, and use these to run tests on their LLM applications. These test cases can be uploaded in bulk, created on the fly, or exported from application traces. LangSmith also makes it easy to run custom evaluations (both LLM and heuristic based) to score test results.

2. **Comparison View**: When prototyping different versions of your applications and making changes, LangSmith provides a user-friendly comparison view for test runs to track and diagnose regressions in test scores across multiple revisions of your application.

3. **Playground**: LangSmith provides a playground environment for rapid iteration and experimentation, allowing you to quickly test out different prompts and models. Every playground run is logged in the system and can be used to create test cases or compare with other runs.

4. **Beta Testing**: LangSmith supports beta testing by allowing developers to collect data on how their LLM applications are performing in real-world scenarios. It facilitates feedback collection and run annotation, which are critical for this workflow.

5. **Capturing Feedback**: LangSmith allows you to attach feedback scores to logged traces, which helps draw attention to interesting runs and highlight edge cases that are causing problematic responses.

6. **Annotating Traces**: LangSmith supports sending runs to annotation queues, allowing annotators to closely inspect interesting traces and annotate them with respect to different criteria.

7. **Adding Runs to a Dataset**: LangSmith enables you to add runs as examples to datasets, expanding your test coverage on real-world scenarios.

8. **Monitoring and A/B Testing**: LangSmith provides monitoring charts that allow you to track key metrics over time and supports tag and metadata grouping for A/B testing changes in prompt, model, or retrieval strategy.

9. **Automations**: LangSmith's automations feature allows you to perform actions on traces in near real-time, such as automatically scoring traces, sending them to annotation queues, or sending them to datasets.

10. **Threads**: LangSmith provides a threads view that groups traces from a single conversation together, making it easier to track the performance of your application across multiple turns.
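
Before adding memory, it can help to see what the retriever itself returns before the documents are stuffed into the prompt. This is a small inspection sketch, not part of the original walkthrough; it assumes the vector store built above and uses the standard Runnable interface that as_retriever() exposes:

# Inspect the raw retrieval step (assumes `vector` from the code above)
retriever = vector.as_retriever()
docs = retriever.invoke("how can langsmith help with testing?")
for doc in docs:
    print(doc.metadata.get("source"), "->", doc.page_content[:120])

Each returned Document is what create_stuff_documents_chain later formats into the {context} slot of the prompt.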

4 LLM Retrieval Chain with Memory

import warnings
warnings.filterwarnings(action="ignore")
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain
from langchain.chains.history_aware_retriever import create_history_aware_retriever
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.prompts import MessagesPlaceholder

llm = ChatOpenAI(model="deepseek-chat", api_key=DEEPSEEK_API_KEY, base_url=BASE_URL)

prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    ("user", "根据上面的对话,生成一个搜索查询来获取与对话相关的信息")
])

# data source and split
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter()
documents_split = text_splitter.split_documents(documents)

embeddings = HuggingFaceEmbeddings(model_name=r"F:\Bert")
vector = FAISS.from_documents(documents_split, embeddings)
retriever = vector.as_retriever()

retriever_chain = create_history_aware_retriever(llm, retriever, prompt)
prompt = ChatPromptTemplate.from_messages([
    ("system", "根据下面的上下文回答用户的问题:\n\n{context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])
document_chain = create_stuff_documents_chain(llm, prompt)
retrieval_chain = create_retrieval_chain(retriever_chain, document_chain)

chat_history = [HumanMessage(content="Can LangSmith help test my LLM applications?"), AIMessage(content="Yes!")]
response = retrieval_chain.invoke({"chat_history": chat_history, "input": "Tell me how"})
print(response)

The LLM retrieval chain with memory replies as follows:

{'chat_history': [HumanMessage(content='Can LangSmith help test my LLM applications?'), AIMessage(content='Yes!')], 'input': 'Tell me how', 'context': [Document(metadata={'source': 'https://docs.smith.langchain.com/user_guide', 'title': 'LangSmith User Guide | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.', 'language': 'en'}, page_content='meaning that they involve a series of interactions between the user and the application. LangSmith provides a threads view that groups traces from a single conversation together, making it easier to track the performance of and annotate your application across multiple turns.Was this page helpful?You can leave detailed feedback on GitHub.PreviousQuick StartNextOverviewPrototypingBeta TestingProductionCommunityDiscordTwitterGitHubDocs CodeLangSmith SDKPythonJS/TSMoreHomepageBlogLangChain Python DocsLangChain JS/TS DocsCopyright © 2024 LangChain, Inc.'), Document(metadata={'source': 'https://docs.smith.langchain.com/user_guide', 'title': 'LangSmith User Guide | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.', 'language': 'en'}, page_content='Skip to main contentGo to API DocsSearchRegionUSEUGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookThis is outdated documentation for 🦜️🛠️ LangSmith, which is no longer actively maintained.For up-to-date documentation, see the latest version.User GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.Prototyping\u200bPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.\nThe ability to rapidly understand how the model is performing — and debug where it is failing — is incredibly important for this phase.Debugging\u200bWhen developing new LLM applications, we suggest having LangSmith tracing enabled by default.\nOftentimes, it isn’t necessary to look at every single trace. However, when things go wrong (an unexpected end result, infinite agent loop, slower than expected execution, higher than expected token usage), it’s extremely helpful to debug by looking through the application traces. 
LangSmith gives clear visibility and debugging information at each step of an LLM sequence, making it much easier to identify and root-cause issues.\nWe provide native rendering of chat messages, functions, and retrieve documents.Initial Test Set\u200bWhile many developers still ship an initial version of their application based on “vibe checks”, we’ve seen an increasing number of engineering teams start to adopt a more test driven approach. LangSmith allows developers to create datasets, which are collections of inputs and reference outputs, and use these to run tests on their LLM applications.\nThese test cases can be uploaded in bulk, created on the fly, or exported from application traces. LangSmith also makes it easy to run custom evaluations (both LLM and heuristic based) to score test results.Comparison View\u200bWhen prototyping different versions of your applications and making changes, it’s important to see whether or not you’ve regressed with respect to your initial test cases.\nOftentimes, changes in the prompt, retrieval strategy, or model choice can have huge implications in responses produced by your application.\nIn order to get a sense for which variant is performing better, it’s useful to be able to view results for different configurations on the same datapoints side-by-side. We’ve invested heavily in a user-friendly comparison view for test runs to track and diagnose regressions in test scores across multiple revisions of your application.Playground\u200bLangSmith provides a playground environment for rapid iteration and experimentation.\nThis allows you to quickly test out different prompts and models. You can open the playground from any prompt or model run in your trace.'), Document(metadata={'source': 'https://docs.smith.langchain.com/user_guide', 'title': 'LangSmith User Guide | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.', 'language': 'en'}, page_content="Every playground run is logged in the system and can be used to create test cases or compare with other runs.Beta Testing\u200bBeta testing allows developers to collect more data on how their LLM applications are performing in real-world scenarios. In this phase, it’s important to develop an understanding for the types of inputs the app is performing well or poorly on and how exactly it’s breaking down in those cases. Both feedback collection and run annotation are critical for this workflow. This will help in curation of test cases that can help track regressions/improvements and development of automatic evaluations.Capturing Feedback\u200bWhen launching your application to an initial set of users, it’s important to gather human feedback on the responses it’s producing. This helps draw attention to the most interesting runs and highlight edge cases that are causing problematic responses. LangSmith allows you to attach feedback scores to logged traces (oftentimes, this is hooked up to a feedback button in your app), then filter on traces that have a specific feedback tag and score. 
A common workflow is to filter on traces that receive a poor user feedback score, then drill down into problematic points using the detailed trace view.Annotating Traces\u200bLangSmith also supports sending runs to annotation queues, which allow annotators to closely inspect interesting traces and annotate them with respect to different criteria. Annotators can be PMs, engineers, or even subject matter experts. This allows users to catch regressions across important evaluation criteria.Adding Runs to a Dataset\u200bAs your application progresses through the beta testing phase, it's essential to continue collecting data to refine and improve its performance. LangSmith enables you to add runs as examples to datasets (from both the project page and within an annotation queue), expanding your test coverage on real-world scenarios. This is a key benefit in having your logging system and your evaluation/testing system in the same platform.Production\u200bClosely inspecting key data points, growing benchmarking datasets, annotating traces, and drilling down into important data in trace view are workflows you’ll also want to do once your app hits production.However, especially at the production stage, it’s crucial to get a high-level overview of application performance with respect to latency, cost, and feedback scores. This ensures that it's delivering desirable results at scale.Online evaluations and automations allow you to process and score production traces in near real-time.Additionally, threads provide a seamless way to group traces from a single conversation, making it easier to track the performance of your application across multiple turns.Monitoring and A/B Testing\u200bLangSmith provides monitoring charts that allow you to track key metrics over time. You can expand to view metrics for a given period and drill down into a specific data point to get a trace table for that time period — this is especially handy for debugging production issues.LangSmith also allows for tag and metadata grouping, which allows users to mark different versions of their applications with different identifiers and view how they are performing side-by-side within each chart. This is helpful for A/B testing changes in prompt, model, or retrieval strategy.Automations\u200bAutomations are a powerful feature in LangSmith that allow you to perform actions on traces in near real-time. This can be used to automatically score traces, send them to annotation queues, or send them to datasets.To define an automation, simply provide a filter condition, a sampling rate, and an action to perform. Automations are particularly helpful for processing traces at production scale.Threads\u200bMany LLM applications are multi-turn, meaning that they involve a series of interactions between the user and the application. LangSmith provides a threads view that groups traces from a single conversation together, making it easier to"), Document(metadata={'source': 'https://docs.smith.langchain.com/user_guide', 'title': 'LangSmith User Guide | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. 
We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.', 'language': 'en'}, page_content='LangSmith User Guide | 🦜️🛠️ LangSmith')], 'answer': "LangSmith offers several features that can help you test your LLM applications:\n\n1. **Initial Test Set**: LangSmith allows you to create datasets, which are collections of inputs and reference outputs. You can use these datasets to run tests on your LLM applications. These test cases can be uploaded in bulk, created on the fly, or exported from application traces. LangSmith also makes it easy to run custom evaluations (both LLM and heuristic based) to score test results.\n\n2. **Comparison View**: When you're prototyping different versions of your applications and making changes, LangSmith provides a comparison view for test runs. This allows you to track and diagnose regressions in test scores across multiple revisions of your application by viewing results for different configurations on the same data points side-by-side.\n\n3. **Playground**: LangSmith's playground environment allows for rapid iteration and experimentation. You can quickly test out different prompts and models, and every playground run is logged in the system. These runs can be used to create test cases or compare with other runs.\n\n4. **Beta Testing**: During the beta testing phase, LangSmith supports capturing feedback and annotating traces. You can attach feedback scores to logged traces and filter on traces that have specific feedback tags and scores. LangSmith also supports sending runs to annotation queues for closer inspection and annotation by PMs, engineers, or subject matter experts.\n\n5. **Production Monitoring**: Once your application is in production, LangSmith provides monitoring charts to track key metrics over time. You can also use tag and metadata grouping to mark different versions of your applications and view their performance side-by-side, which is helpful for A/B testing.\n\n6. **Automations**: LangSmith's automations feature allows you to perform actions on traces in near real-time, such as automatically scoring traces, sending them to annotation queues, or adding them to datasets.\n\n7. **Threads**: For multi-turn LLM applications, LangSmith provides a threads view that groups traces from a single conversation together, making it easier to track the performance of your application across multiple interactions.\n\nBy leveraging these features, LangSmith can significantly aid in the testing and evaluation of your LLM applications throughout their development lifecycle."}
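
The snippet above hard-codes a two-message history for a single call. To keep the conversation going across turns, one common pattern (a sketch, not shown in the original) is to append each exchange to chat_history before the next invoke:

# Hypothetical follow-up turn: grow the history with the previous exchange first
chat_history.append(HumanMessage(content="Tell me how"))
chat_history.append(AIMessage(content=response["answer"]))

followup = retrieval_chain.invoke({
    "chat_history": chat_history,
    "input": "Which of these features matter most during beta testing?",
})
print(followup["answer"])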

5 LLM Retrieval Agent

import os
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain.agents import AgentExecutor
from langchain_openai import ChatOpenAI
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain.tools.retriever import create_retriever_tool

llm = ChatOpenAI(model="deepseek-chat", api_key=DEEPSEEK_API_KEY, base_url=BASE_URL)

# data load / split
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter()
documents_split = text_splitter.split_documents(documents)
# embedding / embedding db / retriever
embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL_NAME)
vector = FAISS.from_documents(documents_split, embeddings)
retriever = vector.as_retriever()

# tool
os.environ["TAVILY_API_KEY"] = TAVILY_API_KEY
retriever_tool = create_retriever_tool(retriever, "langsmith_search", "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!")
search_tool = TavilySearchResults(max_results=3)
tools = [retriever_tool, search_tool]

# agent
prompt = hub.pull("hwchase17/openai-functions-agent")
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
res = agent_executor.invoke({"input": "how can langsmith help with testing?"})
print(res)

The LLM agent replies as follows, using the search tool and the retriever tool:

> Entering new AgentExecutor chain...
LangSmith is a tool for testing and debugging language models (such as GPT-3, GPT-4, and so on). It provides a set of features that help developers test and optimize their models more effectively. Here are some key ways LangSmith helps with testing:

1. **Interactive testing**: LangSmith lets developers interact with a language model through an intuitive interface, view the model's responses in real time, and adjust and re-test.
2. **Logging and tracing**: LangSmith records every model call and response, including inputs, outputs, timestamps, and any error information, which helps developers track down and debug problems.
3. **Performance analysis**: LangSmith provides performance-analysis tools covering response time, resource consumption, and other key metrics, so developers can optimize model performance.
4. **Version control**: LangSmith supports model version control, making it easy to switch between model versions for testing and comparison.
5. **Integration testing**: LangSmith can be integrated with other testing tools and frameworks, such as CI/CD pipelines, to ensure the model is stable and reliable across environments and scenarios.
6. **Data annotation and evaluation**: LangSmith provides data-annotation tools to help developers build and maintain test datasets and evaluate models, ensuring accuracy and reliability.
7. **Automated testing**: LangSmith supports automated test scripts; developers can write test cases and run them automatically, improving testing efficiency.

Through these features, LangSmith helps developers test and optimize their language models more systematically, ensuring performance and reliability in real-world applications.

> Finished chain.
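
6 LangServe

The original quickstart stops at the agent, but the goals at the top also promise serving with LangServe. The following is only a hedged sketch of how one of the chains above (the prompt | llm | parser chain from section 2) could be exposed as a REST endpoint; the module name serve.py, the /chain path, and the port are assumptions, not part of the original.

# serve.py — minimal LangServe sketch (assumed extras: pip install "langserve[all]" fastapi uvicorn)
from fastapi import FastAPI
from langserve import add_routes
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Same chain as section 2; DEEPSEEK_API_KEY and BASE_URL are assumed defined as in the setup sketch.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are world class technical documentation writer."),
    ("user", "{input}")
])
llm = ChatOpenAI(model="deepseek-chat", api_key=DEEPSEEK_API_KEY, base_url=BASE_URL)
chain = prompt | llm | StrOutputParser()

app = FastAPI(title="LangChain Quickstart Server")
add_routes(app, chain, path="/chain")   # exposes /chain/invoke, /chain/batch, /chain/stream

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="localhost", port=8000)

Once the server is running, the chain can be called over HTTP through the /chain/invoke endpoint, or from another Python process via langserve.RemoteRunnable("http://localhost:8000/chain").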