• Semantic Kernel: Integrating Local Models via LocalAI


    LLaMA 2 is a large language model open-sourced by Meta. This article uses LocalAI to integrate LLaMA 2 and demonstrates how Semantic Kernel (SK) works with a locally hosted large model.
    SK supports a wide range of large models, but most of the official samples target GPT-3.5+ on OpenAI and Azure OpenAI Service. Here we look at how to wire SK up to a locally deployed open-source model, using the MIT-licensed open-source project "LocalAI": https://github.com/go-skynet/LocalAI
    LocalAI is a local inference framework that exposes a RESTful API compatible with the OpenAI API specification. It lets you run LLMs (and other models) locally on consumer-grade hardware or on your own servers, supports the many model families compatible with the ggml format, and requires no GPU. LocalAI uses C++ bindings for speed. It builds on llama.cpp, gpt4all, rwkv.cpp, ggml, whisper.cpp for audio transcription, and bert.cpp for embeddings.
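
    Before wiring up SK, it is worth sanity-checking the deployment by calling LocalAI's OpenAI-compatible endpoint directly. Here is a minimal sketch using plain HttpClient; the host, port, and model name are assumptions, so substitute the values of your own deployment:

    using System.Net.Http.Json;

    // Assumed LocalAI endpoint and model name; adjust to your deployment.
    using var client = new HttpClient { BaseAddress = new Uri("http://localhost:8080") };

    // POST an OpenAI-format chat completion request to the local server.
    var response = await client.PostAsJsonAsync("/v1/chat/completions", new
    {
        model = "llama-2-7b-chat",
        messages = new[] { new { role = "user", content = "Hello!" } }
    });

    Console.WriteLine(await response.Content.ReadAsStringAsync());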



    You can follow the official Getting Started guide to deploy LocalAI. LocalAI exposes the locally deployed model through the OpenAI API format, so SK can reach it through its OpenAI connector. All we need to do is point the OpenAI endpoint at LocalAI, which a custom HttpClient handler can take care of, as in the example below:

    internal class OpenAIHttpclientHandler : HttpClientHandler
    {
        private readonly KernelSettings _kernelSettings;

        public OpenAIHttpclientHandler(KernelSettings settings)
        {
            this._kernelSettings = settings;
        }

        protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
        {
            // Redirect chat completion requests away from api.openai.com to the LocalAI endpoint.
            if (request.RequestUri!.LocalPath == "/v1/chat/completions")
            {
                UriBuilder uriBuilder = new UriBuilder(request.RequestUri)
                {
                    Scheme = this._kernelSettings.Scheme,
                    Host = this._kernelSettings.Host,
                    Port = this._kernelSettings.Port
                };

                request.RequestUri = uriBuilder.Uri;
            }

            return await base.SendAsync(request, cancellationToken);
        }
    }
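
    The handler assumes a KernelSettings type that carries the pieces of the LocalAI endpoint alongside the service settings consumed later. A minimal sketch, with property names inferred from how the settings are used in this article (the actual class in the sample repo may differ):

    internal class KernelSettings
    {
        // LocalAI endpoint pieces used by OpenAIHttpclientHandler.
        public string Scheme { get; set; } = "http";
        public string Host { get; set; } = "localhost";
        public int Port { get; set; } = 8080;

        // Service selection and model settings used by AddChatCompletionService below.
        // The Azure OpenAI / OpenAI branches also read DeploymentId, Endpoint, OrgId,
        // and ServiceId; omitted here for brevity. The sample also exposes a static
        // LoadSettings() that reads these values from configuration.
        public string ServiceType { get; set; } = "LocalAI";
        public string ModelId { get; set; } = string.Empty;
        public string ApiKey { get; set; } = string.Empty;
    }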

    With the groundwork in place, the next step is to assemble the components so they work together. Open Visual Studio Code and create a C# project named sk-csharp-hello-world, with the following Program.cs:

    using System.Reflection;
    using config;
    using Microsoft.Extensions.DependencyInjection;
    using Microsoft.Extensions.Logging;
    using Microsoft.SemanticKernel;
    using Microsoft.SemanticKernel.ChatCompletion;
    using Microsoft.SemanticKernel.Connectors.OpenAI;
    using Microsoft.SemanticKernel.PromptTemplates.Handlebars;
    using Plugins;

    var kernelSettings = KernelSettings.LoadSettings();
    var handler = new OpenAIHttpclientHandler(kernelSettings);

    IKernelBuilder builder = Kernel.CreateBuilder();
    builder.Services.AddLogging(c => c.SetMinimumLevel(LogLevel.Information).AddDebug());
    builder.AddChatCompletionService(kernelSettings, handler);
    builder.Plugins.AddFromType<LightPlugin>(); // the sample plugin shipped with the project template
    Kernel kernel = builder.Build();

    // Load the prompt from an embedded resource
    using StreamReader reader = new(Assembly.GetExecutingAssembly().GetManifestResourceStream("prompts.Chat.yaml")!);
    KernelFunction prompt = kernel.CreateFunctionFromPromptYaml(
        reader.ReadToEnd(),
        promptTemplateFactory: new HandlebarsPromptTemplateFactory()
    );

    // Create the chat history
    ChatHistory chatMessages = [];

    // Loop until we are cancelled
    while (true)
    {
        // Get user input
        System.Console.Write("User > ");
        chatMessages.AddUserMessage(Console.ReadLine()!);

        // Get the chat completions
        OpenAIPromptExecutionSettings openAIPromptExecutionSettings = new();
        var result = kernel.InvokeStreamingAsync<StreamingChatMessageContent>(
            prompt,
            arguments: new KernelArguments(openAIPromptExecutionSettings) {
                { "messages", chatMessages }
            });

        // Print the chat completions as they stream in
        ChatMessageContent? chatMessageContent = null;
        await foreach (var content in result)
        {
            if (chatMessageContent == null)
            {
                System.Console.Write("Assistant > ");
                chatMessageContent = new ChatMessageContent(
                    content.Role ?? AuthorRole.Assistant,
                    content.ModelId!,
                    content.Content!,
                    content.InnerContent,
                    content.Encoding,
                    content.Metadata);
            }
            else
            {
                chatMessageContent.Content += content;
            }
            System.Console.Write(content);
        }
        System.Console.WriteLine();

        // Append the assistant's full reply to the history
        chatMessages.Add(chatMessageContent!);
    }

    First, we import the namespaces needed to make everything work (the using directives at the top of the file).

    Next, we create a kernel builder instance (through a factory method rather than a constructor), which we will use to shape our kernel.

    IKernelBuilder builder = Kernel.CreateBuilder();

    Do you want to know what is happening at every step? Absolutely! So we add logging support to the kernel via the builder.Services.AddLogging(...) call.

    Whether we want to use Microsoft's AI models through Azure OpenAI or OpenAI, or a local large model served through LocalAI, we can plug them into our kernel. That is what the AddChatCompletionService extension method does:

    internal static class ServiceCollectionExtensions
    {
        /// <summary>
        /// Adds a chat completion service to the kernel. It can be an OpenAI, Azure OpenAI,
        /// or OpenAI-compatible (LocalAI / Hunyuan) backend service.
        /// </summary>
        internal static IKernelBuilder AddChatCompletionService(this IKernelBuilder kernelBuilder, KernelSettings kernelSettings, HttpClientHandler handler)
        {
            switch (kernelSettings.ServiceType.ToUpperInvariant())
            {
                case ServiceTypes.AzureOpenAI:
                    kernelBuilder = kernelBuilder.AddAzureOpenAIChatCompletion(kernelSettings.DeploymentId, endpoint: kernelSettings.Endpoint, apiKey: kernelSettings.ApiKey, serviceId: kernelSettings.ServiceId, kernelSettings.ModelId);
                    break;

                case ServiceTypes.OpenAI:
                    kernelBuilder = kernelBuilder.AddOpenAIChatCompletion(modelId: kernelSettings.ModelId, apiKey: kernelSettings.ApiKey, orgId: kernelSettings.OrgId, serviceId: kernelSettings.ServiceId);
                    break;

                case ServiceTypes.HunyuanAI:
                case ServiceTypes.LocalAI:
                    // Both go through the OpenAI connector; the custom handler rewrites the
                    // request URI so the OpenAI-formatted calls land on the local endpoint.
                    kernelBuilder = kernelBuilder.AddOpenAIChatCompletion(modelId: kernelSettings.ModelId, apiKey: kernelSettings.ApiKey, httpClient: new HttpClient(handler));
                    break;

                default:
                    throw new ArgumentException($"Invalid service type value: {kernelSettings.ServiceType}");
            }

            return kernelBuilder;
        }
    }
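
    The ServiceTypes referenced in the switch are simple string constants. A minimal sketch (the exact values are assumptions; since the switch compares against ToUpperInvariant(), they must be uppercase):

    internal static class ServiceTypes
    {
        // Compared after ToUpperInvariant(), so the constants are uppercase.
        public const string AzureOpenAI = "AZUREOPENAI";
        public const string OpenAI = "OPENAI";
        public const string HunyuanAI = "HUNYUANAI";
        public const string LocalAI = "LOCALAI";
    }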

    Finally, we start a chat loop that uses SK's streaming invocation, InvokeStreamingAsync, as shown in the while loop at the end of Program.cs. Run the project and you can try out the chat:
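
    The kernel function driving this loop comes from prompts.Chat.yaml, loaded earlier via CreateFunctionFromPromptYaml. A minimal sketch of what such a Handlebars chat prompt can look like, based on SK's YAML prompt schema (the actual file in the sample repo may differ):

    name: Chat
    description: Chat with the assistant.
    template_format: handlebars
    template: |
      {{#each messages}}
        <message role="{{Role}}">{{~Content~}}</message>
      {{/each}}
    input_variables:
      - name: messages
        description: The history of the chat.
        is_required: true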


    Full sample source code for this article: https://github.com/geffzhang/sk-csharp-hello-world


  • Original post: https://www.cnblogs.com/shanyou/p/17988059