Compare commits

..

22 Commits

Author SHA1 Message Date
lightislost 04ccb1f812
Merge pull request #34 from codefuse-ai/pr_mv_muagent
[feature](coagent)<mv coagent to ~/CodeFuse-muAgent project>
2024-04-23 17:26:45 +08:00
shanshi 9b419c2dde [feature](coagent)<mv coagent to ~/CodeFuse-muAgent project> 2024-04-23 16:44:13 +08:00
lightislost eee0e09ee1
Merge pull request #33 from codefuse-ai/pr_webui
[feature](webui)<add config_webui for starting app>
2024-03-28 20:16:00 +08:00
shanshi 2d726185f8 [feature](webui)<add config_webui for starting app> 2024-03-28 20:12:36 +08:00
lightislost fef3e85061
Merge pull request #30 from GeorgeGalway/fix_issue#29
[fix issue#29] api error
2024-03-13 15:59:30 +08:00
高培文 74cab0fd4c [fix issue#29]EMBEDDING_MODEL error 2024-03-13 14:18:18 +08:00
高培文 1c0be9caf2 [fix issue#29] api error 2024-03-13 13:04:30 +08:00
lightislost 333f1e97c6
Merge pull request #28 from codefuse-ai/ma_demo_commit
[feature](coagent)<增加antflow兼容和增加coagent demo>
2024-03-12 15:40:06 +08:00
shanshi 4d9b268a98 [feature](coagent)<增加antflow兼容和增加coagent demo> 2024-03-12 15:31:06 +08:00
lightislost c14b41ecec
Merge pull request #25 from zhangw/chroma-telemetry-disable
disable the posthog telemetry mechnism that may raise the connection error
2024-03-01 14:07:44 +08:00
vincent 7b2e1b0c5a disable posthog telemetry for chromadb 2024-02-29 00:37:05 +08:00
lightislost 3b016353da
Merge pull request #24 from codefuse-ai/issue23
[fix issue#23] import error
2024-02-19 10:51:23 +08:00
shanshi 8c57a9c9b2 [fix issue#23] import error 2024-02-19 10:38:32 +08:00
shanshi 66d029d276 update static wehchat.png 2024-01-31 10:41:27 +08:00
shanshi 2b14ca3188 delete duplicate checkboxes 2024-01-30 15:56:29 +08:00
shanshi c5dfc36af1 add issue template 2024-01-29 14:43:36 +08:00
shanshi f83c1c397d add en-zh doc link 2024-01-29 12:15:27 +08:00
shanshi 0fa4b8222c update readmes docs about coagent 2024-01-29 11:41:06 +08:00
Haotian Zhu b7fdf50da7
Merge pull request #16 from codefuse-ai/coagent_branch
rename dev_opsgpt to coagent, and add memory&prompt manager
2024-01-29 11:09:09 +08:00
shanshi b0091a64a3 rename dev_opsgpt to coagent, and add memory&prompt manager 2024-01-26 14:03:25 +08:00
Haotian Zhu f13234f0fc
Add files via upload 2024-01-23 11:35:29 +08:00
Haotian Zhu b76aa6d28d
Add files via upload 2024-01-22 19:09:33 +08:00
236 changed files with 6346 additions and 16022 deletions

190
.github/CONTRIBUTING.md vendored Normal file
View File

@ -0,0 +1,190 @@
Thank you for your interest in the Codefuse project. We warmly welcome any suggestions, opinions (including criticisms), comments, and contributions to the Codefuse project.
Your suggestions, opinions, and comments on Codefuse can be directly submitted through GitHub Issues.
There are many ways to participate in the Codefuse project and contribute to it: code implementation, test writing, process tool improvement, documentation enhancement, and more. We welcome any contributions and will add you to our list of contributors.
Furthermore, with enough contributions, you may have the opportunity to become a Committer for Codefuse.
For any questions, you can contact us for timely answers through various means including WeChat, Gitter (an instant messaging tool provided by GitHub), email, and more.
## Getting Started
If you are new to the Codefuse community, you can:
- Follow the Codefuse GitHub repository.
- Join related WeChat groups for Codefuse to ask questions at any time;
Through the above methods, you can stay up-to-date with the development dynamics of the Codefuse project and express your opinions on topics of interest.
## Contribution Ways
This contribution guide is not just about writing code. We value and appreciate help in all areas. Here are some ways you can contribute:
- Documentation
- Issues
- Pull Requests (PR)
### Improve Documentation
Documentation is the main way for you to understand Codefuse and is also where we need the most help!
By browsing the documentation, you can deepen your understanding of Codefuse and grasp its features and technical details. If you find any issues with the documentation, please contact us promptly.
If you are interested in improving the quality of the documentation, whether by fixing a page address, correcting a link, or writing a better introductory document, you are very welcome!
Most of our documentation is written in markdown format. You can directly modify and submit documentation changes in the docs/ directory on GitHub. For submitting code changes, please refer to Pull Requests.
### If You Discover a Bug or Issue
If you discover a bug or issue, you can directly submit a new Issue through GitHub Issues, and someone will handle it regularly. For more details, see the [Issue Template](#issue-template).
You can also choose to read and analyze the code to fix it yourself (it is best to communicate with us before doing so, as someone might already be working on the same issue), and then submit a Pull Request.
### Modify Code and Submit a PR (Pull Request)
You can download the code, compile, install, and deploy to try it out (you can refer to the compilation documentation to see if it works as you expected). If there are any issues, you can directly contact us, submit an Issue, or fix it yourself by reading and analyzing the source code. For more details, see [Contribution](#contribution).
Whether it's fixing a bug or adding a feature, we warmly welcome it. If you wish to submit code to Codefuse, fork the repository to your project space on GitHub, create a new branch for your changes, add the original project as an upstream remote, and submit a PR, as sketched below. The method for submitting a PR can be referenced in the Pull Request documentation.
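A minimal sketch of that workflow (the branch name and username placeholder here are only illustrations):
```bash
# Fork codefuse-ai/codefuse-chatbot on GitHub first, then:
git clone https://github.com/<your-username>/codefuse-chatbot.git
cd codefuse-chatbot

# Keep the original repository as an upstream remote
git remote add upstream https://github.com/codefuse-ai/codefuse-chatbot.git

# Create a branch for your change
git checkout -b fix_issue_xx

# ... edit and commit following the Commit Format Specification below ...
git push origin fix_issue_xx
# then open a Pull Request against codefuse-ai/codefuse-chatbot on GitHub
```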
## Issue Type
Issues can be categorized into three types:
- Bug: Issues where code or execution examples contain bugs or lack dependencies, resulting in incorrect execution.
- Documentation: Discrepancies in documentation, inconsistencies between documentation content and code, etc.
- Feature: New functionalities that evolve from the current codebase.
## Issue Template
### Issue: Bug Template
**Checklist before submitting an issue**
<br>Please confirm that you have checked the document, issues, discussions (GitHub feature), and other publicly available documentation.
- I have searched through all documentation related to Codefuse.
- I used GitHub search to find a similar issue, but did not find one.
- I have added a very descriptive title for this issue.
**System Information**
<br>Please confirm your operating system, such as mac-xx, windows-xx, linux-xx.
**Code Version**
<br>Please confirm the code version or branch, such as master, release, etc.
**Problem Description**
<br>Describe the problem you encountered, what you want to achieve, or the bug encountered during code execution.
**Code Example**
<br>Attach your execution code and relevant configuration to facilitate rapid intervention and reproduction.
**Error Information, Logs**
<br>The error logs and related information after executing the above code example.
**Related Dependencies**
<br>Taking the chatbot project as an example:
- connector
- codechat
- sandbox
- ...
### Issue: Documentation Template
**Issue with current documentation:**
<br>Please point out any problems, typos, or confusing points in the current documentation.
**Idea or request for content**
<br>What do you think would be a reasonable way to express the documentation?
### Issue: Feature Template
**Checklist before submitting an issue**
<br>Please confirm that you have checked the document, issues, discussions (GitHub feature), and other publicly available documentation.
- I have searched through all documentation related to Codefuse.
- I used GitHub Issue search to find a similar issue, but did not find one.
- I have added a very descriptive title for this issue.
**Feature Description**
<br>Describe the purpose of this feature.
**Related Examples**
<br>Provide links to any relevant documents, GitHub repositories, papers, or other resources.
**Motivation**
<br>Describe the motivation for this feature. Why is it needed? Provide enough context information to help understand the demand for this feature.
**Contribution**
<br>Describe how you can contribute to building this feature (if you plan to participate).
## Contribution
### Pre-Checklist
- First, confirm that you have checked the documentation, issues, and discussions (a GitHub feature), as well as other publicly available resources.
- Find the GitHub issue you want to address. If none exists, create an issue or a draft PR and ask a maintainer to check it.
- Check for related, similar, or duplicate pull requests
- Create a draft pull request
- Complete the PR template for the description
- Link any GitHub issue(s) that are resolved by your PR
### Description
A description of the PR should be articulated in concise language, highlighting the work completed by the PR. See the specific standards in the [Commit Format Specification](#Commit-Format-Specification).
### Related Issue
#xx, if any
### Test Code with Result
Please provide relevant test code when necessary.
## Commit Format Specification
A commit consists of a "title" and a "body." The title should generally be in lowercase, while the first letter of the body should be uppercase.
### Title
The title of the commit message: `[<type>](<scope>) <subject> (#pr)`
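For example, a hypothetical commit title in this format might be `[fix](sandbox) handle jupyter kernel timeout (#xx)`, where `fix` is the type, `sandbox` the scope, and `#xx` the PR number.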
### Type - Available Options
The type of this commit, limited to the following options (all lowercase):
- fix: Bug fixes
- feature: New features
- feature-wip: Features that are currently in development, such as partial code for a function.
- improvement: Optimizations and improvements to existing features
- style: Adjustments to code style
- typo: Typographical errors in code or documentation
- refactor: Code refactoring (without changing functionality)
- performance/optimize: Performance optimization
- test: Addition or fix of unit tests
- deps: Modifications to third-party dependencies
- community: Community-related changes, such as modifying GitHub Issue templates.
Please note:
- If multiple types apply to one commit, list multiple types.
- If a code refactor also improves performance, both [refactor][optimize] can be added.
- Types not listed above should not be used; if a new type is needed, it must be added to this document first.
### Scope - Available Options
The scope of the modules involved in the current submission. Due to the multitude of functional modules, only a few are listed here, and this list will be updated continuously based on needs.
For example, using a chatbot framework:
- connector
- codechat
- sandbox
- ...
Please note:
Try to use options that are already listed. If you need to add new ones, please update this document promptly.
### Subject Content
The title should clearly indicate the main content of the current submission.
## Example
coming soon
## Reference
[doris-commit-format](https://doris.apache.org/zh-CN/community/how-to-contribute/commit-format-specification)

140
.github/ISSUE_TEMPLATE/bug.yml vendored Normal file
View File

@ -0,0 +1,140 @@
name: "\U0001F41B Bug Report"
description: Report a bug in Codefuse. To report a security issue, please instead use the security option below.
labels: ["01 Bug Report"]
body:
- type: markdown
attributes:
value: >
Thank you for taking the time to file a bug report.
Use this to report bugs in Codefuse.
If you're not certain that your issue is due to a bug in Codefuse, please use [GitHub Discussions](https://github.com/codefuse-ai/codefuse-chatbot/discussions)
to ask for help with your issue.
We warmly welcome any suggestions, opinions (including criticisms), comments, and contributions to the Codefuse project.
Relevant links to check before filing a bug report to see if your issue has already been reported, fixed or
if there's another way to solve your problem:
[API Reference](https://codefuse-ai.github.io/),
[GitHub search](https://github.com/codefuse-ai/codefuse-chatbot),
[Chatbot Github Discussions](https://github.com/codefuse-ai/codefuse-chatbot/discussions),
[Chatbot Github Issues](https://github.com/codefuse-ai/codefuse-chatbot/issues)
- type: checkboxes
id: checks
attributes:
label: Checked other resources
description: Please confirm and check all the following options.
options:
- label: I searched the Codefuse documentation with the integrated search.
required: true
- label: I used the GitHub search to find a similar question and didn't find it.
required: true
- label: I am sure that this is a bug in Codefuse-Repos rather than my code.
required: true
- label: I added a very descriptive title to this issue.
required: true
- type: dropdown
id: system-info
attributes:
label: System Info
description: >
Please select the operating system you were using to run codefuse-ai/repos when this problem occurred.
options:
- Windows
- Linux
- MacOS
- Docker
- Devcontainer / Codespace
- Windows Subsystem for Linux (WSL)
- Other
validations:
required: true
nested_fields:
- type: text
attributes:
label: Specify the system
description: Please specify the system you are working on.
- type: dropdown
attributes:
label: Code Version
description: |
Please select which version of Codefuse-Repos you were using when this issue occurred.
**If you weren't on the latest release, please try it first, as the issue may already have been fixed.**
If installed with git you can run `git branch` to see which version of codefuse-ai you are running.
options:
- Latest Release
- Stable (branch)
- Master (branch)
validations:
required: true
- type: textarea
id: description
attributes:
label: Description
description: |
What is the problem, question, or error?
Write a short description telling what you are doing, what you expect to happen, and what is currently happening.
placeholder: |
* I'm trying to use the `coagent` library to do X.
* I expect to see Y.
* Instead, it does Z.
validations:
required: true
- type: textarea
id: reproduction
validations:
required: true
attributes:
label: Example Code
description: |
Please add a self-contained, [minimal, reproducible, example](https://stackoverflow.com/help/minimal-reproducible-example) with your use case.
If a maintainer can copy it, run it, and see it right away, there's a much higher chance that you'll be able to get help.
**Important!**
* Use code tags (e.g., ```python ... ```) to correctly [format your code](https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting).
* INCLUDE the language label (e.g. `python`) after the first three backticks to enable syntax highlighting. (e.g., ```python rather than ```).
* Reduce your code to the minimum required to reproduce the issue if possible. This makes it much easier for others to help you.
* Avoid screenshots when possible, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
placeholder: |
The following code:
```python
from coagent.tools import toLangchainTools, TOOL_DICT, TOOL_SETS
from coagent.connector.phase import BasePhase
from coagent.connector.schema import Message
phase_name = "baseGroupPhase"
phase = BasePhase(
phase_name, embed_config=embed_config, llm_config=llm_config,
)
query_content = "确认本地是否存在employee_data.csv并查看它有哪些列和数据类型;然后画柱状图"
query = Message(
role_name="human", role_type="user", tools=[],
role_content=query_content, input_query=query_content, origin_query=query_content,
)
output_message, output_memory = phase.step(query)
```
- type: textarea
id: error
validations:
required: false
attributes:
label: Error Message and Stack Trace (if applicable)
description: |
If you are reporting an error, please include the full error message and stack trace.
placeholder: |
Exception + full stack trace

View File

@ -0,0 +1,19 @@
name: Documentation
description: Report an issue related to the Codefuse documentation.
title: "DOC: <Please write a comprehensive title after the 'DOC: ' prefix>"
labels: [02 - Documentation]
body:
- type: textarea
attributes:
label: "Issue with current documentation:"
description: >
Please make sure to leave a reference to the document/code you're
referring to.
- type: textarea
attributes:
label: "Idea or request for content:"
description: >
Please describe as clearly as possible what topics you think are missing
from the current documentation.

35
.github/ISSUE_TEMPLATE/features.yml vendored Normal file
View File

@ -0,0 +1,35 @@
name: Feature request 🚀
description: Suggest a new idea for Codefuse!
labels: ['03 New Features']
body:
- type: markdown
attributes:
value: |
First, check out our [wiki page on Contributing](https://github.com/Significant-Gravitas/Nexus/wiki/Contributing)
Please provide a searchable summary of the issue in the title above ⬆️.
- type: checkboxes
id: checks
attributes:
label: Checked other resources
description: Please confirm and check all the following options.
options:
- label: I searched the Codefuse documentation with the integrated search.
required: true
- label: I used the GitHub search to find a similar question and didn't find it.
required: true
- type: textarea
attributes:
label: Summary 💡
description: Describe how it should work.
- type: textarea
attributes:
label: Examples 🌈
description: Provide a link to other implementations, or screenshots of the expected behavior.
- type: textarea
attributes:
label: Motivation 🔦
description: What are you trying to accomplish? How has the lack of this feature affected you? Providing context helps us come up with a solution that is more useful in the real world.

7
.gitignore vendored
View File

@ -1,6 +1,7 @@
**/__pycache__
knowledge_base
logs
llm_models
embedding_models
jupyter_work
model_config.py
@ -10,4 +11,10 @@ code_base
.DS_Store
.idea
data
.pyc
tests
*egg-info
build
dist
package.sh
local_config.json

View File

@ -3,7 +3,6 @@ From python:3.9.18-bookworm
WORKDIR /home/user
COPY ./requirements.txt /home/user/docker_requirements.txt
COPY ./jupyter_start.sh /home/user/jupyter_start.sh
RUN apt-get update

201
LICENSE
View File

@ -1,201 +0,0 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@ -5,6 +5,8 @@
# <p align="center">CodeFuse-ChatBot: Development by Private Knowledge Augmentation</p>
<p align="center">
<a href="README.md"><img src="https://img.shields.io/badge/文档-中文版-yellow.svg" alt="ZH doc"></a>
<a href="README_en.md"><img src="https://img.shields.io/badge/document-English-yellow.svg" alt="EN doc"></a>
<img src="https://img.shields.io/github/license/codefuse-ai/codefuse-chatbot" alt="License">
<a href="https://github.com/codefuse-ai/codefuse-chatbot/issues">
<img alt="Open Issues" src="https://img.shields.io/github/issues-raw/codefuse-ai/codefuse-chatbot" />
@ -12,10 +14,11 @@
<br><br>
</p>
DevOps-ChatBot是由蚂蚁CodeFuse团队开发的开源AI智能助手致力于简化和优化软件开发生命周期中的各个环节。该项目结合了Multi-Agent的协同调度机制并集成了丰富的工具库、代码库、知识库和沙盒环境使得LLM模型能够在DevOps领域内有效执行和处理复杂任务。
CodeFuse-ChatBot是由蚂蚁CodeFuse团队开发的开源AI智能助手致力于简化和优化软件开发生命周期中的各个环节。该项目结合了Multi-Agent的协同调度机制并集成了丰富的工具库、代码库、知识库和沙盒环境使得LLM模型能够在DevOps领域内有效执行和处理复杂任务。
## 🔔 更新
- [2024.01.29] 开放可配置化的multi-agent框架codefuse-muAgent详情见[使用说明](sources/readme_docs/coagent/coagent.md)
- [2023.12.26] 基于FastChat接入开源私有化大模型和大模型接口的能力开放
- [2023.12.14] 量子位公众号专题报道:[文章链接](https://mp.weixin.qq.com/s/MuPfayYTk9ZW6lcqgMpqKA)
- [2023.12.01] Multi-Agent和代码库检索功能开放
@ -38,7 +41,7 @@ DevOps-ChatBot是由蚂蚁CodeFuse团队开发的开源AI智能助手致力
💡 本项目旨在通过检索增强生成Retrieval Augmented GenerationRAG、工具学习Tool Learning和沙盒环境来构建软件开发全生命周期的AI智能助手涵盖设计、编码、测试、部署和运维等阶段。 逐渐从各处资料查询、独立分散平台操作的传统开发运维模式转变到大模型问答的智能化开发运维模式,改变人们的开发运维习惯。
本项目核心差异技术、功能点:
- **🧠 智能调度核心:** 构建了体系链路完善的调度核心,支持多模式一键配置,简化操作流程。 [使用说明](sources/readme_docs/multi-agent.md)
- **🧠 智能调度核心:** 构建了体系链路完善的调度核心,支持多模式一键配置,简化操作流程。 [使用说明](sources/readme_docs/coagent/coagent.md)
- **💻 代码整库分析:** 实现了仓库级的代码深入理解,以及项目文件级的代码编写与生成,提升了开发效率。
- **📄 文档分析增强:** 融合了文档知识库与知识图谱,通过检索和推理增强,为文档分析提供了更深层次的支持。
- **🔧 垂类专属知识:** 为DevOps领域定制的专属知识库支持垂类知识库的自助一键构建便捷实用。
@ -93,7 +96,13 @@ DevOps-ChatBot是由蚂蚁CodeFuse团队开发的开源AI智能助手致力
## 🚀 快速使用
### muagent-py
完整文档见:[CodeFuse-muAgent](sources/readme_docs/coagent/coagent.md)
```
pip install codefuse-muagent
```
### 使用ChatBot
请自行安装 nvidia 驱动程序,本项目已在 Python 3.9.18CUDA 11.7 环境下Windows、X86 架构的 macOS 系统中完成测试。
Docker安装、私有化LLM接入及相关启动问题见[快速使用明细](sources/readme_docs/start.md)
@ -114,60 +123,29 @@ cd codefuse-chatbot
pip install -r requirements.txt
```
2、基础配置
```bash
# 修改服务启动的基础配置
cd configs
cp model_config.py.example model_config.py
cp server_config.py.example server_config.py
# model_config#11~12 若需要使用openai接口openai接口key
os.environ["OPENAI_API_KEY"] = "sk-xxx"
# 可自行替换自己需要的api_base_url
os.environ["API_BASE_URL"] = "https://api.openai.com/v1"
# vi model_config#LLM_MODEL 你需要选择的语言模型
LLM_MODEL = "gpt-3.5-turbo"
LLM_MODELs = ["gpt-3.5-turbo"]
# vi model_config#EMBEDDING_MODEL 你需要选择的私有化向量模型
EMBEDDING_ENGINE = 'model'
EMBEDDING_MODEL = "text2vec-base"
# vi model_config#embedding_model_dict 修改成你的本地路径如果能直接连接huggingface则无需修改
# 若模型地址为:
model_dir: ~/codefuse-chatbot/embedding_models/shibing624/text2vec-base-chinese
# 配置如下
"text2vec-base": "shibing624/text2vec-base-chinese",
# vi server_config#8~14, 推荐采用容器启动服务
DOCKER_SERVICE = True
# 是否采用容器沙箱
SANDBOX_DO_REMOTE = True
# 是否采用api服务来进行
NO_REMOTE_API = True
```
3、启动服务
默认只启动webui相关服务,未启动fastchat(可选)
```bash
# 若需要支撑codellama-34b-int4模型需要给fastchat打一个补丁
# cp examples/gptq.py ~/site-packages/fastchat/modules/gptq.py
# dev_opsgpt/service/llm_api.py#258 修改为 kwargs={"gptq_wbits": 4},
# start llm-service可选
python dev_opsgpt/service/llm_api.py
```
更多LLM接入方法见[详情...](sources/readme_docs/fastchat.md)
<br>
2、启动服务
```bash
# 完成server_config.py配置后可一键启动
cd examples
python start.py
bash start.sh
# 开始在页面进行相关配置,然后打开`启动对话服务`即可
```
<div align=center>
<img src="sources/docs_imgs/webui_config.png" alt="图片">
</div>
或者通过`start.py`进行启动[老版启动方式](sources/readme_docs/start.md)
更多LLM接入方法见[更多细节...](sources/readme_docs/fastchat.md)
<br>
## 贡献指南
非常感谢您对 Codefuse 项目感兴趣,我们非常欢迎您对 Codefuse 项目的各种建议、意见(包括批评)、评论和贡献。
您对 Codefuse 的各种建议、意见、评论可以直接通过 GitHub 的 Issues 提出。
参与 Codefuse 项目并为其作出贡献的方法有很多:代码实现、测试编写、流程工具改进、文档完善等等。任何贡献我们都会非常欢迎,并将您加入贡献者列表。详见[Contribution Guide...](sources/readme_docs/contribution/contribute_guide.md)
## 🤗 致谢

View File

@ -5,6 +5,8 @@
# <p align="center">Codefuse-ChatBot: Development by Private Knowledge Augmentation</p>
<p align="center">
<a href="README.md"><img src="https://img.shields.io/badge/文档-中文版-yellow.svg" alt="ZH doc"></a>
<a href="README_EN.md"><img src="https://img.shields.io/badge/document-英文版-yellow.svg" alt="EN doc"></a>
<img src="https://img.shields.io/github/license/codefuse-ai/codefuse-chatbot" alt="License">
<a href="https://github.com/codefuse-ai/codefuse-chatbot/issues">
<img alt="Open Issues" src="https://img.shields.io/github/issues-raw/codefuse-ai/codefuse-chatbot" />
@ -15,6 +17,8 @@ This project is an open-source AI intelligent assistant, specifically designed f
## 🔔 Updates
- [2024.01.29] A configurational multi-agent framework, codefuse-muagent, has been open-sourced. For more details, please refer to [codefuse-muagent](sources/readme_docs/coagent/coagent-en.md)
- [2023.12.26] Opening the capability to integrate with open-source private large models and large model interfaces based on FastChat
- [2023.12.01] Release of Multi-Agent and codebase retrieval functionalities.
- [2023.11.15] Addition of Q&A enhancement mode based on the local codebase.
- [2023.09.15] Launch of sandbox functionality for local/isolated environments, enabling knowledge retrieval from specified URLs using web crawlers.
@ -30,13 +34,13 @@ This project is an open-source AI intelligent assistant, specifically designed f
💡 The aim of this project is to construct an AI intelligent assistant for the entire lifecycle of software development, covering design, coding, testing, deployment, and operations, through Retrieval Augmented Generation (RAG), Tool Learning, and sandbox environments. It transitions gradually from the traditional development and operations mode of querying information from various sources and operating on standalone, disparate platforms to an intelligent development and operations mode based on large-model Q&A, changing people's development and operations habits.
- **🧠 Intelligent Scheduling Core:** Constructed a well-integrated scheduling core system that supports multi-mode one-click configuration, simplifying the operational process.
- **🧠 Intelligent Scheduling Core:** Constructed a well-integrated scheduling core system that supports multi-mode one-click configuration, simplifying the operational process. [codefuse-muagent](sources/readme_docs/coagent/coagent-en.md)
- **💻 Comprehensive Code Repository Analysis:** Achieved in-depth understanding at the repository level and coding and generation at the project file level, enhancing development efficiency.
- **📄 Enhanced Document Analysis:** Integrated document knowledge bases with knowledge graphs, providing deeper support for document analysis through enhanced retrieval and reasoning.
- **🔧 Industry-Specific Knowledge:** Tailored a specialized knowledge base for the DevOps domain, supporting the self-service one-click construction of industry-specific knowledge bases for convenience and practicality.
- **🤖 Compatible Models for Specific Verticals:** Designed small models specifically for the DevOps field, ensuring compatibility with related DevOps platforms and promoting the integration of the technological ecosystem.
🌍 Relying on open-source LLM and Embedding models, this project can achieve offline private deployments based on open-source models. Additionally, this project also supports the use of the OpenAI API.
🌍 Relying on open-source LLM and Embedding models, this project can achieve offline private deployments based on open-source models. Additionally, this project also supports the use of the OpenAI API. [Access Demo](sources/readme_docs/fastchat-en.md)
👥 The core development team has been long-term focused on research in the AIOps + NLP domain. We initiated the CodefuseGPT project, hoping that everyone could contribute high-quality development and operations documents widely, jointly perfecting this solution to achieve the goal of "Making Development Seamless for Everyone."
@ -64,7 +68,7 @@ This project is an open-source AI intelligent assistant, specifically designed f
- 💬 **LLM:**Supports various open-source models and LLM interfaces.
- 🛠️ **API Management:** Enables rapid integration of open-source components and operational platforms.
For implementation details, see: [Technical Route Details](sources/readme_docs/roadmap.md)
For implementation details, see: [Technical Route Details](sources/readme_docs/roadmap-en.md)
## 🌐 Model Integration
@ -79,7 +83,13 @@ If you need to integrate a specific model, please inform us of your requirements
## 🚀 Quick Start
### muagent-py
For more details, see [codefuse-muagent](sources/readme_docs/coagent/coagent-en.md); a minimal usage sketch follows the install command below.
```
pip install codefuse-muagent
```
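A minimal usage sketch, adapted from the example code in the bug-report issue template added in this changeset; the construction of `llm_config` and `embed_config` is not shown there and is assumed to be covered by the codefuse-muagent guide linked above:
```python
# Adapted from the bug-report template in this changeset; the coagent-style
# interface is assumed to carry over to the codefuse-muagent package.
from coagent.connector.phase import BasePhase
from coagent.connector.schema import Message

# llm_config / embed_config are placeholders here -- build them as described
# in the codefuse-muagent documentation before running this sketch.
llm_config = ...
embed_config = ...

# "baseGroupPhase" is the phase name used in the template's example
phase = BasePhase("baseGroupPhase", embed_config=embed_config, llm_config=llm_config)

query_content = "check whether employee_data.csv exists locally, list its columns and dtypes, then draw a bar chart"
query = Message(
    role_name="human", role_type="user", tools=[],
    role_content=query_content, input_query=query_content, origin_query=query_content,
)

# step() returns the final answer message and the accumulated memory
output_message, output_memory = phase.step(query)
```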
### ChatBot-UI
Please install the Nvidia driver yourself; this project has been tested on Python 3.9.18, CUDA 11.7, Windows, and X86 architecture macOS systems.
1. Preparation of Python environment
@ -98,92 +108,28 @@ cd Codefuse-ChatBot
pip install -r requirements.txt
```
2. Preparation of Sandbox Environment
- Windows Docker installation:
[Docker Desktop for Windows](https://docs.docker.com/desktop/install/windows-install/) supports 64-bit versions of Windows 10 Pro, with Hyper-V enabled (not required for versions v1903 and above), or 64-bit versions of Windows 10 Home v1903 and above.
- [Comprehensive Detailed Windows 10 Docker Installation Tutorial](https://zhuanlan.zhihu.com/p/441965046)
- [Docker: From Beginner to Practitioner](https://yeasy.gitbook.io/docker_practice/install/windows)
- [Handling Docker Desktop requires the Server service to be enabled](https://blog.csdn.net/sunhy_csdn/article/details/106526991)
- [Install wsl or wait for error prompt](https://learn.microsoft.com/en-us/windows/wsl/install)
- Linux Docker Installation:
Linux installation is relatively simple, please search Baidu/Google for installation instructions.
- Mac Docker Installation
- [Docker: From Beginner to Practitioner](https://yeasy.gitbook.io/docker_practice/install/mac)
```bash
# Build images for the sandbox environment, see above for notebook version issues
bash docker_build.sh
```
3. Model Download (Optional)
If you need to use open-source LLM and Embedding models, you can download them from HuggingFace.
Here, we use THUDM/chatglm2-6b and text2vec-base-chinese as examples:
```
# install git-lfs
git lfs install
# install LLM-model
git lfs clone https://huggingface.co/THUDM/chatglm2-6b
# install Embedding-model
git lfs clone https://huggingface.co/shibing624/text2vec-base-chinese
```
4. Basic Configuration
```bash
# Modify the basic configuration for service startup
cd configs
cp model_config.py.example model_config.py
cp server_config.py.example server_config.py
# model_config#11~12 If you need to use the openai interface, openai interface key
os.environ["OPENAI_API_KEY"] = "sk-xxx"
# You can replace the api_base_url yourself
os.environ["API_BASE_URL"] = "https://api.openai.com/v1"
# vi model_config#105 You need to choose the language model
LLM_MODEL = "gpt-3.5-turbo"
# vi model_config#43 You need to choose the vector model
EMBEDDING_MODEL = "text2vec-base"
# vi model_config#25 Modify to your local path, if you can directly connect to huggingface, no modification is needed
"text2vec-base": "shibing624/text2vec-base-chinese",
# vi server_config#8~14, it is recommended to start the service using containers.
DOCKER_SERVICE = True
# Whether to use container sandboxing is up to your specific requirements and preferences
SANDBOX_DO_REMOTE = True
# Whether to use api-service to use chatbot
NO_REMOTE_API = True
```
5. Start the Service
By default, only webui related services are started, and fastchat is not started (optional).
```bash
# if use codellama-34b-int4, you should replace fastchat's gptq.py
# cp examples/gptq.py ~/site-packages/fastchat/modules/gptq.py
# dev_opsgpt/service/llm_api.py#258 => kwargs={"gptq_wbits": 4},
# start llm-service (optional)
python dev_opsgpt/service/llm_api.py
```
2. Start the Service
```bash
# After configuring server_config.py, you can start with just one click.
cd examples
bash start_webui.sh
bash start.sh
# you can configure your llm model and embedding model, then choose "启动对话服务" (start the chat service)
```
<div align=center>
<img src="sources/docs_imgs/webui_config.png" alt="图片">
</div>
Or start it with `python start.py` following the [old-version startup guide](sources/readme_docs/start-en.md)
For more details about accessing LLM models, see [More Details...](sources/readme_docs/fastchat.md)
<br>
## Contribution
Thank you for your interest in the Codefuse project. We warmly welcome any suggestions, opinions (including criticisms), comments, and contributions to the Codefuse project.
Your suggestions, opinions, and comments on Codefuse can be directly submitted through GitHub Issues.
There are many ways to participate in the Codefuse project and contribute to it: code implementation, test writing, process tool improvement, documentation enhancement, and more. We welcome any contributions and will add you to our list of contributors. See [contribution guide](sources/readme_docs/contribution/contribute_guide_en.md)
## 🤗 Acknowledgements
This project is based on [langchain-chatchat](https://github.com/chatchat-space/Langchain-Chatchat) and [codebox-api](https://github.com/shroominic/codebox-api). We deeply appreciate their contributions to open source!
This project is based on [langchain-chatchat](https://github.com/chatchat-space/Langchain-Chatchat) and [codebox-api](https://github.com/shroominic/codebox-api). We deeply appreciate their contributions to open source!

View File

@ -1,4 +0,0 @@
from .model_config import *
from .server_config import *
VERSION = "v0.1.0"

116
configs/default_config.py Normal file
View File

@ -0,0 +1,116 @@
import os
import platform
#
system_name = platform.system()
# 日志存储路径
LOG_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "logs")
# 知识库默认存储路径
SOURCE_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "sources")
# 知识库默认存储路径
KB_ROOT_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "knowledge_base")
# 代码库默认存储路径
CB_ROOT_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "code_base")
# nltk 模型存储路径
NLTK_DATA_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "nltk_data")
# 代码存储路径
JUPYTER_WORK_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "jupyter_work")
# WEB_CRAWL存储路径
WEB_CRAWL_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "knowledge_base")
# NEBULA_DATA存储路径
NEBULA_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "data/nebula_data")
# 语言模型存储路径
LOCAL_LLM_MODEL_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "llm_models")
# 向量模型存储路径
LOCAL_EM_MODEL_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "embedding_models")
# CHROMA 存储路径
CHROMA_PERSISTENT_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "data/chroma_data")
for _path in [LOG_PATH, SOURCE_PATH, KB_ROOT_PATH, CB_ROOT_PATH, NLTK_DATA_PATH, JUPYTER_WORK_PATH, WEB_CRAWL_PATH, NEBULA_PATH, CHROMA_PERSISTENT_PATH, LOCAL_LLM_MODEL_DIR, LOCAL_EM_MODEL_DIR]:
if not os.path.exists(_path):
os.makedirs(_path, exist_ok=True)
path_envt_dict = {
"LOG_PATH": LOG_PATH, "SOURCE_PATH": SOURCE_PATH, "KB_ROOT_PATH": KB_ROOT_PATH,
"NLTK_DATA_PATH":NLTK_DATA_PATH, "JUPYTER_WORK_PATH": JUPYTER_WORK_PATH,
"WEB_CRAWL_PATH": WEB_CRAWL_PATH, "NEBULA_PATH": NEBULA_PATH,
"CHROMA_PERSISTENT_PATH": CHROMA_PERSISTENT_PATH
}
for path_name, _path in path_envt_dict.items():
os.environ[path_name] = _path
# 数据库默认存储路径。
# 如果使用sqlite可以直接修改DB_ROOT_PATH如果使用其它数据库请直接修改SQLALCHEMY_DATABASE_URI。
DB_ROOT_PATH = os.path.join(KB_ROOT_PATH, "info.db")
SQLALCHEMY_DATABASE_URI = f"sqlite:///{DB_ROOT_PATH}"
# 可选向量库类型及对应配置
kbs_config = {
"faiss": {
},
# "milvus": {
# "host": "127.0.0.1",
# "port": "19530",
# "user": "",
# "password": "",
# "secure": False,
# },
# "pg": {
# "connection_uri": "postgresql://postgres:postgres@127.0.0.1:5432/langchain_chatchat",
# }
}
# 默认向量库类型。可选faiss, milvus, pg.
DEFAULT_VS_TYPE = "faiss"
# 缓存向量库数量
CACHED_VS_NUM = 1
# 知识库中单段文本长度
CHUNK_SIZE = 500
# 知识库中相邻文本重合长度
OVERLAP_SIZE = 50
# 知识库匹配向量数量
VECTOR_SEARCH_TOP_K = 5
# 知识库匹配相关度阈值取值范围在0-1之间SCORE越小相关度越高取到1相当于不筛选建议设置在0.5左右
# Mac 可能存在无法使用normalized_L2的问题因此调整SCORE_THRESHOLD至 0~1100
FAISS_NORMALIZE_L2 = True if system_name in ["Linux", "Windows"] else False
SCORE_THRESHOLD = 1 if system_name in ["Linux", "Windows"] else 1100
# 搜索引擎匹配结题数量
SEARCH_ENGINE_TOP_K = 5
# 代码引擎匹配结题数量
CODE_SEARCH_TOP_K = 1
# API 是否开启跨域默认为False如果需要开启请设置为True
# is open cross domain
OPEN_CROSS_DOMAIN = False
# Bing 搜索必备变量
# 使用 Bing 搜索需要使用 Bing Subscription Key,需要在azure port中申请试用bing search
# 具体申请方式请见
# https://learn.microsoft.com/en-us/bing/search-apis/bing-web-search/create-bing-search-service-resource
# 使用python创建bing api 搜索实例详见:
# https://learn.microsoft.com/en-us/bing/search-apis/bing-web-search/quickstarts/rest/python
BING_SEARCH_URL = "https://api.bing.microsoft.com/v7.0/search"
# 注意不是bing Webmaster Tools的api key
# 此外如果是在服务器上报Failed to establish a new connection: [Errno 110] Connection timed out
# 是因为服务器加了防火墙需要联系管理员加白名单如果公司的服务器的话就别想了GG
BING_SUBSCRIPTION_KEY = ""
# 是否开启中文标题加强,以及标题增强的相关配置
# 通过增加标题判断判断哪些文本为标题并在metadata中进行标记
# 然后将文本与往上一级的标题进行拼合,实现文本信息的增强。
ZH_TITLE_ENHANCE = False
log_verbose = False

View File

@ -4,30 +4,76 @@ import logging
import torch
import openai
import base64
import json
from .utils import is_running_in_docker
from .default_config import *
# 日志格式
LOG_FORMAT = "%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s"
logger = logging.getLogger()
logger.setLevel(logging.INFO)
logging.basicConfig(format=LOG_FORMAT)
# os.environ["OPENAI_PROXY"] = "socks5h://127.0.0.1:13659"
os.environ["API_BASE_URL"] = "http://openai.com/v1/chat/completions"
os.environ["OPENAI_API_KEY"] = ""
os.environ["DUCKDUCKGO_PROXY"] = os.environ.get("DUCKDUCKGO_PROXY") or "socks5://127.0.0.1:13659"
os.environ["BAIDU_OCR_API_KEY"] = ""
os.environ["BAIDU_OCR_SECRET_KEY"] = ""
VERSION = "v0.1.0"
import platform
system_name = platform.system()
try:
# ignore these content
from zdatafront import client, monkey, OPENAI_API_BASE
# patch openai sdk
monkey.patch_openai()
secret_key = base64.b64decode('xx').decode('utf-8')
# zdatafront 提供的统一加密密钥
client.aes_secret_key = secret_key
# zdatafront 分配的业务标记
client.visit_domain = os.environ.get("visit_domain")
client.visit_biz = os.environ.get("visit_biz")
client.visit_biz_line = os.environ.get("visit_biz_line")
except Exception as e:
OPENAI_API_BASE = "https://api.openai.com/v1"
logger.error(e)
pass
try:
cur_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)))
with open(os.path.join(cur_dir, "local_config.json"), "r") as f:
update_config = json.load(f)
except:
update_config = {}
# add your openai key
os.environ["API_BASE_URL"] = os.environ.get("API_BASE_URL") or update_config.get("API_BASE_URL") or OPENAI_API_BASE
os.environ["OPENAI_API_KEY"] = os.environ.get("OPENAI_API_KEY") or update_config.get("OPENAI_API_KEY") or "sk-xx"
openai.api_key = os.environ["OPENAI_API_KEY"]
# os.environ["OPENAI_PROXY"] = "socks5h://127.0.0.1:13659"
os.environ["DUCKDUCKGO_PROXY"] = os.environ.get("DUCKDUCKGO_PROXY") or update_config.get("DUCKDUCKGO_PROXY") or "socks5h://127.0.0.1:13659"
# ignore if you don't use baidu_ocr_api
os.environ["BAIDU_OCR_API_KEY"] = "xx"
os.environ["BAIDU_OCR_SECRET_KEY"] = "xx"
os.environ["log_verbose"] = "2"
# LLM 名称
EMBEDDING_ENGINE = os.environ.get("EMBEDDING_ENGINE") or update_config.get("EMBEDDING_ENGINE") or 'model' # openai or model
EMBEDDING_MODEL = os.environ.get("EMBEDDING_MODEL") or update_config.get("EMBEDDING_MODEL") or "text2vec-base"
LLM_MODEL = os.environ.get("LLM_MODEL") or "gpt-3.5-turbo"
LLM_MODELs = [LLM_MODEL]
USE_FASTCHAT = "gpt" not in LLM_MODEL # 判断是否进行fastchat
# LLM 运行设备
LLM_DEVICE = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
# 在以下字典中修改属性值以指定本地embedding模型存储位置
# 如将 "text2vec": "GanymedeNil/text2vec-large-chinese" 修改为 "text2vec": "User/Downloads/text2vec-large-chinese"
# 此处请写绝对路径
embedding_model_dict = {
embedding_model_dict = json.loads(os.environ.get("embedding_model_dict")) if os.environ.get("embedding_model_dict") else {}
embedding_model_dict = embedding_model_dict or update_config.get("embedding_model_dict")
embedding_model_dict = embedding_model_dict or {
"ernie-tiny": "nghuyong/ernie-3.0-nano-zh",
"ernie-base": "nghuyong/ernie-3.0-base-zh",
"text2vec-base": "shibing624/text2vec-base-chinese",
"text2vec-base": "text2vec-base-chinese",
"text2vec": "GanymedeNil/text2vec-large-chinese",
"text2vec-paraphrase": "shibing624/text2vec-base-chinese-paraphrase",
"text2vec-sentence": "shibing624/text2vec-base-chinese-sentence",
@ -41,35 +87,35 @@ embedding_model_dict = {
}
LOCAL_MODEL_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "embedding_models")
embedding_model_dict = {k: f"/home/user/chatbot/embedding_models/{v}" if is_running_in_docker() else f"{LOCAL_MODEL_DIR}/{v}" for k, v in embedding_model_dict.items()}
# 选用的 Embedding 名称
EMBEDDING_ENGINE = 'openai'
EMBEDDING_MODEL = "text2vec-base"
# LOCAL_MODEL_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "embedding_models")
# embedding_model_dict = {k: f"/home/user/chatbot/embedding_models/{v}" if is_running_in_docker() else f"{LOCAL_MODEL_DIR}/{v}" for k, v in embedding_model_dict.items()}
# Embedding 模型运行设备
EMBEDDING_DEVICE = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
ONLINE_LLM_MODEL = {
ONLINE_LLM_MODEL = json.loads(os.environ.get("ONLINE_LLM_MODEL")) if os.environ.get("ONLINE_LLM_MODEL") else {}
ONLINE_LLM_MODEL = ONLINE_LLM_MODEL or update_config.get("ONLINE_LLM_MODEL")
ONLINE_LLM_MODEL = ONLINE_LLM_MODEL or {
# 线上模型。请在server_config中为每个在线API设置不同的端口
"openai-api": {
"model_name": "gpt-3.5-turbo",
"api_base_url": "https://api.openai.com/v1",
"api_base_url": OPENAI_API_BASE, # "https://api.openai.com/v1",
"api_key": "",
"openai_proxy": "",
},
"example": {
"version": "gpt-3.5", # 采用openai接口做示例
"api_base_url": "https://api.openai.com/v1",
"version": "gpt-3.5-turbo", # 采用openai接口做示例
"api_base_url": OPENAI_API_BASE, # "https://api.openai.com/v1",
"api_key": "",
"provider": "ExampleWorker",
},
}
# 建议使用chat模型不要使用base无法获取正确输出
llm_model_dict = {
llm_model_dict = json.loads(os.environ.get("llm_model_dict")) if os.environ.get("llm_model_dict") else {}
llm_model_dict = llm_model_dict or update_config.get("llm_model_dict")
llm_model_dict = llm_model_dict or {
"chatglm-6b": {
"local_model_path": "THUDM/chatglm-6b",
"api_base_url": "http://localhost:8888/v1", # "name"修改为fastchat服务中的"api_base_url"
@ -100,10 +146,27 @@ llm_model_dict = {
"api_base_url": os.environ.get("API_BASE_URL"),
"api_key": os.environ.get("OPENAI_API_KEY")
},
"gpt-3.5-turbo-0613": {
"local_model_path": "gpt-3.5-turbo-0613",
"api_base_url": os.environ.get("API_BASE_URL"),
"api_key": os.environ.get("OPENAI_API_KEY")
},
"gpt-4": {
"local_model_path": "gpt-4",
"api_base_url": os.environ.get("API_BASE_URL"),
"api_key": os.environ.get("OPENAI_API_KEY")
},
"gpt-3.5-turbo-1106": {
"local_model_path": "gpt-3.5-turbo-1106",
"api_base_url": os.environ.get("API_BASE_URL"),
"api_key": os.environ.get("OPENAI_API_KEY")
},
}
# 建议使用chat模型不要使用base无法获取正确输出
VLLM_MODEL_DICT = {
VLLM_MODEL_DICT = json.loads(os.environ.get("VLLM_MODEL_DICT")) if os.environ.get("VLLM_MODEL_DICT") else {}
VLLM_MODEL_DICT = VLLM_MODEL_DICT or update_config.get("VLLM_MODEL_DICT")
VLLM_MODEL_DICT = VLLM_MODEL_DICT or {
'chatglm2-6b': "THUDM/chatglm-6b",
}
# 以下模型经过测试可接入,配置仿照上述即可
@ -113,154 +176,21 @@ VLLM_MODEL_DICT = {
# 'chatglm3-6b-base', 'Qwen-72B-Chat-Int4'
LOCAL_LLM_MODEL_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "llm_models")
# 若不想修改模型地址,可取消相关地址设
llm_model_dict_c = {}
for k, v in llm_model_dict.items():
v_c = {}
for kk, vv in v.items():
if k=="local_model_path":
v_c[kk] = f"/home/user/chatbot/llm_models/{vv}" if is_running_in_docker() else f"{LOCAL_LLM_MODEL_DIR}/{vv}"
else:
v_c[kk] = vv
llm_model_dict_c[k] = v_c
# LOCAL_LLM_MODEL_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "llm_models")
# # 模型路径重
# llm_model_dict_c = {}
# for k, v in llm_model_dict.items():
# v_c = {}
# for kk, vv in v.items():
# if k=="local_model_path":
# v_c[kk] = f"/home/user/chatbot/llm_models/{vv}" if is_running_in_docker() else f"{LOCAL_LLM_MODEL_DIR}/{vv}"
# else:
# v_c[kk] = vv
# llm_model_dict_c[k] = v_c
llm_model_dict = llm_model_dict_c
# 若不想修改模型地址,可取消相关地址设置
VLLM_MODEL_DICT_c = {}
for k, v in VLLM_MODEL_DICT.items():
VLLM_MODEL_DICT_c[k] = f"/home/user/chatbot/llm_models/{v}" if is_running_in_docker() else f"{LOCAL_LLM_MODEL_DIR}/{v}"
VLLM_MODEL_DICT = VLLM_MODEL_DICT_c
# LLM 名称
# EMBEDDING_ENGINE = 'openai'
EMBEDDING_ENGINE = 'model'
EMBEDDING_MODEL = "text2vec-base"
# LLM_MODEL = "gpt-4"
LLM_MODEL = "gpt-3.5-turbo-16k"
LLM_MODELs = ["gpt-3.5-turbo-16k"]
USE_FASTCHAT = "gpt" not in LLM_MODEL # 判断是否进行fastchat
# LLM 运行设备
LLM_DEVICE = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
# 日志存储路径
LOG_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "logs")
if not os.path.exists(LOG_PATH):
os.mkdir(LOG_PATH)
# 知识库默认存储路径
SOURCE_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "sources")
# 知识库默认存储路径
KB_ROOT_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "knowledge_base")
# 代码库默认存储路径
CB_ROOT_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "code_base")
# nltk 模型存储路径
NLTK_DATA_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "nltk_data")
# 代码存储路径
JUPYTER_WORK_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "jupyter_work")
# WEB_CRAWL存储路径
WEB_CRAWL_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "knowledge_base")
# NEBULA_DATA存储路径
NELUBA_PATH = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "data/neluba_data")
for _path in [LOG_PATH, SOURCE_PATH, KB_ROOT_PATH, NLTK_DATA_PATH, JUPYTER_WORK_PATH, WEB_CRAWL_PATH, NELUBA_PATH]:
if not os.path.exists(_path):
os.makedirs(_path, exist_ok=True)
# 数据库默认存储路径。
# 如果使用sqlite可以直接修改DB_ROOT_PATH如果使用其它数据库请直接修改SQLALCHEMY_DATABASE_URI。
DB_ROOT_PATH = os.path.join(KB_ROOT_PATH, "info.db")
SQLALCHEMY_DATABASE_URI = f"sqlite:///{DB_ROOT_PATH}"
# 可选向量库类型及对应配置
kbs_config = {
"faiss": {
},
# "milvus": {
# "host": "127.0.0.1",
# "port": "19530",
# "user": "",
# "password": "",
# "secure": False,
# },
# "pg": {
# "connection_uri": "postgresql://postgres:postgres@127.0.0.1:5432/langchain_chatchat",
# }
}
# 默认向量库类型。可选faiss, milvus, pg.
DEFAULT_VS_TYPE = "faiss"
# 缓存向量库数量
CACHED_VS_NUM = 1
# 知识库中单段文本长度
CHUNK_SIZE = 500
# 知识库中相邻文本重合长度
OVERLAP_SIZE = 50
# 知识库匹配向量数量
VECTOR_SEARCH_TOP_K = 5
# 知识库匹配相关度阈值取值范围在0-1之间SCORE越小相关度越高取到1相当于不筛选建议设置在0.5左右
# Mac 可能存在无法使用normalized_L2的问题因此调整SCORE_THRESHOLD至 0~1100
FAISS_NORMALIZE_L2 = True if system_name in ["Linux", "Windows"] else False
SCORE_THRESHOLD = 1 if system_name in ["Linux", "Windows"] else 1100
# 搜索引擎匹配结题数量
SEARCH_ENGINE_TOP_K = 5
# 代码引擎匹配结题数量
CODE_SEARCH_TOP_K = 1
# 基于本地知识问答的提示词模版
PROMPT_TEMPLATE = """【指令】根据已知信息,简洁和专业的来回答问题。如果无法从中得到答案,请说 “根据已知信息无法回答该问题”,不允许在答案中添加编造成分,答案请使用中文。
【已知信息】{context}
【问题】{question}"""
# 基于本地代码知识问答的提示词模版
CODE_PROMPT_TEMPLATE = """【指令】根据已知信息来回答问题。
【已知信息】{context}
【问题】{question}"""
# 代码解释模版
CODE_INTERPERT_TEMPLATE = '''{code}
解释一下这段代码'''
# API 是否开启跨域默认为False如果需要开启请设置为True
# is open cross domain
OPEN_CROSS_DOMAIN = False
# Bing 搜索必备变量
# 使用 Bing 搜索需要使用 Bing Subscription Key,需要在azure port中申请试用bing search
# 具体申请方式请见
# https://learn.microsoft.com/en-us/bing/search-apis/bing-web-search/create-bing-search-service-resource
# 使用python创建bing api 搜索实例详见:
# https://learn.microsoft.com/en-us/bing/search-apis/bing-web-search/quickstarts/rest/python
BING_SEARCH_URL = "https://api.bing.microsoft.com/v7.0/search"
# 注意不是bing Webmaster Tools的api key
# 此外如果是在服务器上报Failed to establish a new connection: [Errno 110] Connection timed out
# 是因为服务器加了防火墙需要联系管理员加白名单如果公司的服务器的话就别想了GG
BING_SUBSCRIPTION_KEY = ""
# 是否开启中文标题加强,以及标题增强的相关配置
# 通过增加标题判断判断哪些文本为标题并在metadata中进行标记
# 然后将文本与往上一级的标题进行拼合,实现文本信息的增强。
ZH_TITLE_ENHANCE = False
log_verbose = False
# llm_model_dict = llm_model_dict_c
# #
# VLLM_MODEL_DICT_c = {}
# for k, v in VLLM_MODEL_DICT.items():
# VLLM_MODEL_DICT_c[k] = f"/home/user/chatbot/llm_models/{v}" if is_running_in_docker() else f"{LOCAL_LLM_MODEL_DIR}/{v}"
# VLLM_MODEL_DICT = VLLM_MODEL_DICT_c

View File

@ -1,17 +1,31 @@
from .model_config import LLM_MODEL, LLM_DEVICE
import os
import os, json
try:
cur_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)))
with open(os.path.join(cur_dir, "local_config.json"), "r") as f:
update_config = json.load(f)
except:
update_config = {}
# Whether the API allows cross-origin requests (CORS). Defaults to False; set to True to enable.
OPEN_CROSS_DOMAIN = False
# Whether to start the services in Docker containers
DOCKER_SERVICE = True
try:
DOCKER_SERVICE = json.loads(os.environ["DOCKER_SERVICE"]) or update_config.get("DOCKER_SERVICE") or False
except:
DOCKER_SERVICE = True
# Whether to use the container sandbox
SANDBOX_DO_REMOTE = True
try:
SANDBOX_DO_REMOTE = json.loads(os.environ["SANDBOX_DO_REMOTE"]) or update_config.get("SANDBOX_DO_REMOTE") or False
except:
SANDBOX_DO_REMOTE = True
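Both flags above follow the same lookup order (environment variable, then local_config.json, then a default); a consolidated helper shown only as a sketch, not part of the original file:
def _read_bool_flag(name: str, fallback: bool) -> bool:
    # The environment variable wins, then local_config.json; any lookup/parse error falls back to `fallback`.
    try:
        return json.loads(os.environ[name]) or update_config.get(name) or False
    except Exception:
        return fallback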
# Whether to go through the API service
NO_REMOTE_API = True
# Default bind host for each server
DEFAULT_BIND_HOST = "127.0.0.1"
os.environ["DEFAULT_BIND_HOST"] = DEFAULT_BIND_HOST
#
CONTRAINER_NAME = "devopsgpt_webui"
@ -57,13 +71,10 @@ NEBULA_GRAPH_SERVER = {
"docker_port": NEBULA_PORT
}
# chroma conf
CHROMA_PERSISTENT_PATH = '/home/user/chatbot/data/chroma_data'
# sandbox api server
SANDBOX_CONTRAINER_NAME = "devopsgpt_sandbox"
SANDBOX_IMAGE_NAME = "devopsgpt:py39"
SANDBOX_HOST = os.environ.get("SANDBOX_HOST") or DEFAULT_BIND_HOST # "172.25.0.3"
SANDBOX_HOST = os.environ.get("SANDBOX_HOST") or update_config.get("SANDBOX_HOST") or DEFAULT_BIND_HOST # "172.25.0.3"
SANDBOX_SERVER = {
"host": f"http://{SANDBOX_HOST}",
"port": 5050,
@ -75,7 +86,10 @@ SANDBOX_SERVER = {
# fastchat model_worker server
# These models must be correctly configured in model_config.llm_model_dict.
# When launching startup.py, a model can be specified with `--model-worker --model-name xxxx`; if omitted, LLM_MODEL is used.
FSCHAT_MODEL_WORKERS = {
# Chat models are recommended; base models cannot produce correct output here.
FSCHAT_MODEL_WORKERS = json.loads(os.environ.get("FSCHAT_MODEL_WORKERS")) if os.environ.get("FSCHAT_MODEL_WORKERS") else {}
FSCHAT_MODEL_WORKERS = FSCHAT_MODEL_WORKERS or update_config.get("FSCHAT_MODEL_WORKERS")
FSCHAT_MODEL_WORKERS = FSCHAT_MODEL_WORKERS or {
"default": {
"host": DEFAULT_BIND_HOST,
"port": 20002,
@ -119,7 +133,9 @@ FSCHAT_MODEL_WORKERS = {
'chatglm3-6b-32k': {'host': DEFAULT_BIND_HOST, 'port': 20018},
'chatglm3-6b-base': {'host': DEFAULT_BIND_HOST, 'port': 20019},
'Qwen-72B-Chat-Int4': {'host': DEFAULT_BIND_HOST, 'port': 20020},
'gpt-3.5-turbo': {'host': DEFAULT_BIND_HOST, 'port': 20021}
'gpt-3.5-turbo': {'host': DEFAULT_BIND_HOST, 'port': 20021},
'example': {'host': DEFAULT_BIND_HOST, 'port': 20022},
'openai-api': {'host': DEFAULT_BIND_HOST, 'port': 20023}
}
# fastchat multi model worker server
FSCHAT_MULTI_MODEL_WORKERS = {

View File

@ -1,7 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/9 下午4:01
@desc:
'''

View File

@ -1,11 +0,0 @@
from .base_chat import Chat
from .knowledge_chat import KnowledgeChat
from .llm_chat import LLMChat
from .search_chat import SearchChat
from .code_chat import CodeChat
from .agent_chat import AgentChat
__all__ = [
"Chat", "KnowledgeChat", "LLMChat", "SearchChat", "CodeChat", "AgentChat"
]

View File

@ -1,325 +0,0 @@
from fastapi import Body, Request
from fastapi.responses import StreamingResponse
from typing import List, Union, Dict
from loguru import logger
import importlib
import copy
import json
from pathlib import Path
from configs.model_config import (
llm_model_dict, LLM_MODEL, PROMPT_TEMPLATE,
VECTOR_SEARCH_TOP_K, SCORE_THRESHOLD)
from dev_opsgpt.tools import (
toLangchainTools,
TOOL_DICT, TOOL_SETS
)
from dev_opsgpt.connector.phase import BasePhase
from dev_opsgpt.connector.agents import BaseAgent, ReactAgent
from dev_opsgpt.connector.chains import BaseChain
from dev_opsgpt.connector.schema import (
Message,
load_phase_configs, load_chain_configs, load_role_configs
)
from dev_opsgpt.connector.schema import Memory
from dev_opsgpt.utils.common_utils import file_normalize
from dev_opsgpt.chat.utils import History, wrap_done
from dev_opsgpt.connector.configs import PHASE_CONFIGS, AGETN_CONFIGS, CHAIN_CONFIGS
PHASE_MODULE = importlib.import_module("dev_opsgpt.connector.phase")
class AgentChat:
def __init__(
self,
engine_name: str = "",
top_k: int = 1,
stream: bool = False,
) -> None:
self.top_k = top_k
self.stream = stream
self.chatPhase_dict: Dict[str, BasePhase] = {}
def chat(
self,
query: str = Body(..., description="用户输入", examples=["hello"]),
phase_name: str = Body(..., description="执行场景名称", examples=["chatPhase"]),
chain_name: str = Body(..., description="执行链的名称", examples=["chatChain"]),
history: List[History] = Body(
[], description="历史对话",
examples=[[{"role": "user", "content": "我们来玩成语接龙,我先来,生龙活虎"}]]
),
doc_engine_name: str = Body(..., description="知识库名称", examples=["samples"]),
search_engine_name: str = Body(..., description="搜索引擎名称", examples=["duckduckgo"]),
code_engine_name: str = Body(..., description="代码引擎名称", examples=["samples"]),
top_k: int = Body(VECTOR_SEARCH_TOP_K, description="匹配向量数"),
score_threshold: float = Body(SCORE_THRESHOLD, description="知识库匹配相关度阈值取值范围在0-1之间SCORE越小相关度越高取到1相当于不筛选建议设置在0.5左右", ge=0, le=1),
stream: bool = Body(False, description="流式输出"),
local_doc_url: bool = Body(False, description="知识文件返回本地路径(true)或URL(false)"),
choose_tools: List[str] = Body([], description="选择tool的集合"),
do_search: bool = Body(False, description="是否进行搜索"),
do_doc_retrieval: bool = Body(False, description="是否进行知识库检索"),
do_code_retrieval: bool = Body(False, description="是否执行代码检索"),
do_tool_retrieval: bool = Body(False, description="是否执行工具检索"),
custom_phase_configs: dict = Body({}, description="自定义phase配置"),
custom_chain_configs: dict = Body({}, description="自定义chain配置"),
custom_role_configs: dict = Body({}, description="自定义role配置"),
history_node_list: List = Body([], description="代码历史相关节点"),
isDetailed: bool = Body(False, description="是否输出完整的agent相关内容"),
upload_file: Union[str, Path, bytes] = "",
**kargs
) -> Message:
# update configs
phase_configs, chain_configs, agent_configs = self.update_configs(
custom_phase_configs, custom_chain_configs, custom_role_configs)
logger.info('phase_configs={}'.format(phase_configs))
logger.info('chain_configs={}'.format(chain_configs))
logger.info('agent_configs={}'.format(agent_configs))
logger.info('phase_name={}'.format(phase_name))
logger.info('chain_name={}'.format(chain_name))
# choose tools
tools = toLangchainTools([TOOL_DICT[i] for i in choose_tools if i in TOOL_DICT])
logger.debug(f"upload_file: {upload_file}")
if upload_file:
upload_file_name = upload_file if upload_file and isinstance(upload_file, str) else upload_file.name
for _filename_idx in range(len(upload_file_name), 0, -1):
if upload_file_name[:_filename_idx] in query:
query = query.replace(upload_file_name[:_filename_idx], upload_file_name)
break
input_message = Message(
role_content=query,
role_type="human",
role_name="user",
input_query=query,
origin_query=query,
phase_name=phase_name,
chain_name=chain_name,
do_search=do_search,
do_doc_retrieval=do_doc_retrieval,
do_code_retrieval=do_code_retrieval,
do_tool_retrieval=do_tool_retrieval,
doc_engine_name=doc_engine_name, search_engine_name=search_engine_name,
code_engine_name=code_engine_name,
score_threshold=score_threshold, top_k=top_k,
history_node_list=history_node_list,
tools=tools
)
# history memory management
history = Memory(messages=[
Message(role_name=i["role"], role_type=i["role"], role_content=i["content"])
for i in history
])
# start to execute
phase_class = getattr(PHASE_MODULE, phase_configs[input_message.phase_name]["phase_type"])
phase = phase_class(input_message.phase_name,
task = input_message.task,
phase_config = phase_configs,
chain_config = chain_configs,
role_config = agent_configs,
do_summary=phase_configs[input_message.phase_name]["do_summary"],
do_code_retrieval=input_message.do_code_retrieval,
do_doc_retrieval=input_message.do_doc_retrieval,
do_search=input_message.do_search,
)
output_message, local_memory = phase.step(input_message, history)
# logger.debug(f"local_memory: {local_memory.to_str_messages(content_key='step_content')}")
# return {
# "answer": output_message.role_content,
# "db_docs": output_message.db_docs,
# "search_docs": output_message.search_docs,
# "code_docs": output_message.code_docs,
# "figures": output_message.figures
# }
def chat_iterator(message: Message, local_memory: Memory, isDetailed=False):
step_content = local_memory.to_str_messages(content_key='step_content', filter_roles=["user"])
final_content = message.role_content
result = {
"answer": "",
"db_docs": [str(doc) for doc in message.db_docs],
"search_docs": [str(doc) for doc in message.search_docs],
"code_docs": [str(doc) for doc in message.code_docs],
"related_nodes": [doc.get_related_node() for idx, doc in enumerate(message.code_docs) if idx==0],
"figures": message.figures,
"step_content": step_content,
"final_content": final_content,
}
related_nodes, has_nodes = [], []
for nodes in result["related_nodes"]:
    for node in nodes:
        if node not in has_nodes:
            related_nodes.append(node)
            has_nodes.append(node)
result["related_nodes"] = related_nodes
# logger.debug(f"{result['figures'].keys()}, isDetailed: {isDetailed}")
message_str = step_content
if self.stream:
for token in message_str:
result["answer"] = token
yield json.dumps(result, ensure_ascii=False)
else:
for token in message_str:
result["answer"] += token
yield json.dumps(result, ensure_ascii=False)
return StreamingResponse(chat_iterator(output_message, local_memory, isDetailed), media_type="text/event-stream")
def achat(
self,
query: str = Body(..., description="用户输入", examples=["hello"]),
phase_name: str = Body(..., description="执行场景名称", examples=["chatPhase"]),
chain_name: str = Body(..., description="执行链的名称", examples=["chatChain"]),
history: List[History] = Body(
[], description="历史对话",
examples=[[{"role": "user", "content": "我们来玩成语接龙,我先来,生龙活虎"}]]
),
doc_engine_name: str = Body(..., description="知识库名称", examples=["samples"]),
search_engine_name: str = Body(..., description="搜索引擎名称", examples=["duckduckgo"]),
code_engine_name: str = Body(..., description="代码引擎名称", examples=["samples"]),
cb_search_type: str = Body(..., description="代码查询模式", examples=["tag"]),
top_k: int = Body(VECTOR_SEARCH_TOP_K, description="匹配向量数"),
score_threshold: float = Body(SCORE_THRESHOLD, description="知识库匹配相关度阈值取值范围在0-1之间SCORE越小相关度越高取到1相当于不筛选建议设置在0.5左右", ge=0, le=1),
stream: bool = Body(False, description="流式输出"),
local_doc_url: bool = Body(False, description="知识文件返回本地路径(true)或URL(false)"),
choose_tools: List[str] = Body([], description="选择tool的集合"),
do_search: bool = Body(False, description="是否进行搜索"),
do_doc_retrieval: bool = Body(False, description="是否进行知识库检索"),
do_code_retrieval: bool = Body(False, description="是否执行代码检索"),
do_tool_retrieval: bool = Body(False, description="是否执行工具检索"),
custom_phase_configs: dict = Body({}, description="自定义phase配置"),
custom_chain_configs: dict = Body({}, description="自定义chain配置"),
custom_role_configs: dict = Body({}, description="自定义role配置"),
history_node_list: List = Body([], description="代码历史相关节点"),
isDetailed: bool = Body(False, description="是否输出完整的agent相关内容"),
upload_file: Union[str, Path, bytes] = "",
**kargs
) -> Message:
# update configs
phase_configs, chain_configs, agent_configs = self.update_configs(
custom_phase_configs, custom_chain_configs, custom_role_configs)
# choose tools
tools = toLangchainTools([TOOL_DICT[i] for i in choose_tools if i in TOOL_DICT])
logger.debug(f"upload_file: {upload_file}")
if upload_file:
upload_file_name = upload_file if upload_file and isinstance(upload_file, str) else upload_file.name
for _filename_idx in range(len(upload_file_name), 0, -1):
if upload_file_name[:_filename_idx] in query:
query = query.replace(upload_file_name[:_filename_idx], upload_file_name)
break
input_message = Message(
role_content=query,
role_type="human",
role_name="user",
input_query=query,
origin_query=query,
phase_name=phase_name,
chain_name=chain_name,
do_search=do_search,
do_doc_retrieval=do_doc_retrieval,
do_code_retrieval=do_code_retrieval,
do_tool_retrieval=do_tool_retrieval,
doc_engine_name=doc_engine_name,
search_engine_name=search_engine_name,
code_engine_name=code_engine_name,
cb_search_type=cb_search_type,
score_threshold=score_threshold, top_k=top_k,
history_node_list=history_node_list,
tools=tools
)
# history memory management
history = Memory(messages=[
Message(role_name=i["role"], role_type=i["role"], role_content=i["content"])
for i in history
])
# start to execute
if phase_configs[input_message.phase_name]["phase_type"] not in self.chatPhase_dict:
phase_class = getattr(PHASE_MODULE, phase_configs[input_message.phase_name]["phase_type"])
phase = phase_class(input_message.phase_name,
task = input_message.task,
phase_config = phase_configs,
chain_config = chain_configs,
role_config = agent_configs,
do_summary=phase_configs[input_message.phase_name]["do_summary"],
do_code_retrieval=input_message.do_code_retrieval,
do_doc_retrieval=input_message.do_doc_retrieval,
do_search=input_message.do_search,
)
self.chatPhase_dict[phase_configs[input_message.phase_name]["phase_type"]] = phase
else:
phase = self.chatPhase_dict[phase_configs[input_message.phase_name]["phase_type"]]
def chat_iterator(message: Message, local_memory: Memory, isDetailed=False):
step_content = local_memory.to_str_messages(content_key='step_content', filter_roles=["user"])
step_content = "\n\n".join([f"{v}" for parsed_output in local_memory.get_parserd_output_list() for k, v in parsed_output.items() if k not in ["Action Status"]])
final_content = message.role_content
result = {
"answer": "",
"db_docs": [str(doc) for doc in message.db_docs],
"search_docs": [str(doc) for doc in message.search_docs],
"code_docs": [str(doc) for doc in message.code_docs],
"related_nodes": [doc.get_related_node() for idx, doc in enumerate(message.code_docs) if idx==0],
"figures": message.figures,
"step_content": step_content or final_content,
"final_content": final_content,
}
related_nodes, has_nodes = [], []
for nodes in result["related_nodes"]:
    for node in nodes:
        if node not in has_nodes:
            related_nodes.append(node)
            has_nodes.append(node)
result["related_nodes"] = related_nodes
# logger.debug(f"{result['figures'].keys()}, isDetailed: {isDetailed}")
message_str = step_content
if self.stream:
for token in message_str:
result["answer"] = token
yield json.dumps(result, ensure_ascii=False)
else:
for token in message_str:
result["answer"] += token
yield json.dumps(result, ensure_ascii=False)
for output_message, local_memory in phase.astep(input_message, history):
# logger.debug(f"output_message: {output_message.role_content}")
# output_message = Message(**output_message)
# local_memory = Memory(**local_memory)
for result in chat_iterator(output_message, local_memory, isDetailed):
yield result
def _chat(self, ):
pass
def update_configs(self, custom_phase_configs, custom_chain_configs, custom_role_configs):
'''update phase/chain/agent configs'''
phase_configs = copy.deepcopy(PHASE_CONFIGS)
phase_configs.update(custom_phase_configs)
chain_configs = copy.deepcopy(CHAIN_CONFIGS)
chain_configs.update(custom_chain_configs)
agent_configs = copy.deepcopy(AGETN_CONFIGS)
agent_configs.update(custom_role_configs)
# phase_configs = load_phase_configs(new_phase_configs)
# chian_configs = load_chain_configs(new_chain_configs)
# agent_configs = load_role_configs(new_agent_configs)
return phase_configs, chain_configs, agent_configs

View File

@ -1,145 +0,0 @@
from fastapi import Body, Request
from fastapi.responses import StreamingResponse
import asyncio, json
from typing import List, AsyncIterable
from langchain import LLMChain
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.prompts.chat import ChatPromptTemplate
from dev_opsgpt.llm_models import getChatModel
from dev_opsgpt.chat.utils import History, wrap_done
from configs.model_config import (llm_model_dict, LLM_MODEL, VECTOR_SEARCH_TOP_K, SCORE_THRESHOLD)
from dev_opsgpt.utils import BaseResponse
from loguru import logger
class Chat:
def __init__(
self,
engine_name: str = "",
top_k: int = 1,
stream: bool = False,
) -> None:
self.engine_name = engine_name
self.top_k = top_k
self.stream = stream
def check_service_status(self, ) -> BaseResponse:
return BaseResponse(code=200, msg=f"okok")
def chat(
self,
query: str = Body(..., description="用户输入", examples=["hello"]),
history: List[History] = Body(
[], description="历史对话",
examples=[[{"role": "user", "content": "我们来玩成语接龙,我先来,生龙活虎"}]]
),
engine_name: str = Body(..., description="知识库名称", examples=["samples"]),
top_k: int = Body(VECTOR_SEARCH_TOP_K, description="匹配向量数"),
score_threshold: float = Body(SCORE_THRESHOLD, description="知识库匹配相关度阈值取值范围在0-1之间SCORE越小相关度越高取到1相当于不筛选建议设置在0.5左右", ge=0, le=1),
stream: bool = Body(False, description="流式输出"),
local_doc_url: bool = Body(False, description="知识文件返回本地路径(true)或URL(false)"),
request: Request = None,
**kargs
):
self.engine_name = engine_name if isinstance(engine_name, str) else engine_name.default
self.top_k = top_k if isinstance(top_k, int) else top_k.default
self.score_threshold = score_threshold if isinstance(score_threshold, float) else score_threshold.default
self.stream = stream if isinstance(stream, bool) else stream.default
self.local_doc_url = local_doc_url if isinstance(local_doc_url, bool) else local_doc_url.default
self.request = request
return self._chat(query, history, **kargs)
def _chat(self, query: str, history: List[History], **kargs):
history = [History(**h) if isinstance(h, dict) else h for h in history]
## check that service dependencies are ok
service_status = self.check_service_status()
if service_status.code!=200: return service_status
def chat_iterator(query: str, history: List[History]):
model = getChatModel()
result, content = self.create_task(query, history, model, **kargs)
logger.info('result={}'.format(result))
logger.info('content={}'.format(content))
if self.stream:
for token in content["text"]:
result["answer"] = token
yield json.dumps(result, ensure_ascii=False)
else:
for token in content["text"]:
result["answer"] += token
yield json.dumps(result, ensure_ascii=False)
return StreamingResponse(chat_iterator(query, history),
media_type="text/event-stream")
def achat(
self,
query: str = Body(..., description="用户输入", examples=["hello"]),
history: List[History] = Body(
[], description="历史对话",
examples=[[{"role": "user", "content": "我们来玩成语接龙,我先来,生龙活虎"}]]
),
engine_name: str = Body(..., description="知识库名称", examples=["samples"]),
top_k: int = Body(VECTOR_SEARCH_TOP_K, description="匹配向量数"),
score_threshold: float = Body(SCORE_THRESHOLD, description="知识库匹配相关度阈值取值范围在0-1之间SCORE越小相关度越高取到1相当于不筛选建议设置在0.5左右", ge=0, le=1),
stream: bool = Body(False, description="流式输出"),
local_doc_url: bool = Body(False, description="知识文件返回本地路径(true)或URL(false)"),
request: Request = None,
):
self.engine_name = engine_name if isinstance(engine_name, str) else engine_name.default
self.top_k = top_k if isinstance(top_k, int) else top_k.default
self.score_threshold = score_threshold if isinstance(score_threshold, float) else score_threshold.default
self.stream = stream if isinstance(stream, bool) else stream.default
self.local_doc_url = local_doc_url if isinstance(local_doc_url, bool) else local_doc_url.default
self.request = request
return self._achat(query, history)
def _achat(self, query: str, history: List[History]):
history = [History(**h) if isinstance(h, dict) else h for h in history]
## check that service dependencies are ok
service_status = self.check_service_status()
if service_status.code!=200: return service_status
async def chat_iterator(query, history):
callback = AsyncIteratorCallbackHandler()
model = getChatModel()
task, result = self.create_atask(query, history, model, callback)
if self.stream:
    # stream tokens as they arrive from the async callback handler
    async for token in callback.aiter():
        result["answer"] = token
        yield json.dumps(result, ensure_ascii=False)
else:
    async for token in callback.aiter():
        result["answer"] += token
        yield json.dumps(result, ensure_ascii=False)
await task
return StreamingResponse(chat_iterator(query, history),
media_type="text/event-stream")
def create_task(self, query: str, history: List[History], model, **kargs):
'''Build the LLM generation task'''
chat_prompt = ChatPromptTemplate.from_messages(
[i.to_msg_tuple() for i in history] + [("human", "{input}")]
)
chain = LLMChain(prompt=chat_prompt, llm=model)
content = chain({"input": query})
return {"answer": "", "docs": ""}, content
def create_atask(self, query, history, model, callback: AsyncIteratorCallbackHandler):
chat_prompt = ChatPromptTemplate.from_messages(
[i.to_msg_tuple() for i in history] + [("human", "{input}")]
)
chain = LLMChain(prompt=chat_prompt, llm=model)
task = asyncio.create_task(wrap_done(
chain.acall({"input": query}), callback.done
))
return task, {"answer": "", "docs": ""}

View File

@ -1,148 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: code_chat.py
@time: 2023/10/24 下午4:04
@desc:
'''
from fastapi import Request, Body
import os, asyncio
from urllib.parse import urlencode
from typing import List
from fastapi.responses import StreamingResponse
from langchain import LLMChain
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.prompts.chat import ChatPromptTemplate
from configs.model_config import (
llm_model_dict, LLM_MODEL, PROMPT_TEMPLATE,
VECTOR_SEARCH_TOP_K, SCORE_THRESHOLD, CODE_PROMPT_TEMPLATE)
from dev_opsgpt.chat.utils import History, wrap_done
from dev_opsgpt.utils import BaseResponse
from .base_chat import Chat
from dev_opsgpt.llm_models import getChatModel
from dev_opsgpt.service.kb_api import search_docs, KBServiceFactory
from dev_opsgpt.service.cb_api import search_code, cb_exists_api
from loguru import logger
import json
class CodeChat(Chat):
def __init__(
self,
code_base_name: str = '',
code_limit: int = 1,
stream: bool = False,
request: Request = None,
) -> None:
super().__init__(engine_name=code_base_name, stream=stream)
self.engine_name = code_base_name
self.code_limit = code_limit
self.request = request
self.history_node_list = []
def check_service_status(self) -> BaseResponse:
cb = cb_exists_api(self.engine_name)
if not cb:
return BaseResponse(code=404, msg=f"未找到代码库 {self.engine_name}")
return BaseResponse(code=200, msg=f"找到代码库 {self.engine_name}")
def _process(self, query: str, history: List[History], model):
'''process'''
codes_res = search_code(query=query, cb_name=self.engine_name, code_limit=self.code_limit,
search_type=self.cb_search_type,
history_node_list=self.history_node_list)
context = codes_res['context']
related_vertices = codes_res['related_vertices']
# update node names
# node_names = [node[0] for node in nodes]
# self.history_node_list.extend(node_names)
# self.history_node_list = list(set(self.history_node_list))
source_nodes = []
for inum, node_name in enumerate(related_vertices[0:5]):
source_nodes.append(f'{inum + 1}. 节点名: `{node_name}`')
logger.info('history={}'.format(history))
logger.info('message={}'.format([i.to_msg_tuple() for i in history] + [("human", CODE_PROMPT_TEMPLATE)]))
chat_prompt = ChatPromptTemplate.from_messages(
[i.to_msg_tuple() for i in history] + [("human", CODE_PROMPT_TEMPLATE)]
)
logger.info('chat_prompt={}'.format(chat_prompt))
chain = LLMChain(prompt=chat_prompt, llm=model)
result = {"answer": "", "codes": source_nodes}
return chain, context, result
def chat(
self,
query: str = Body(..., description="用户输入", examples=["hello"]),
history: List[History] = Body(
[], description="历史对话",
examples=[[{"role": "user", "content": "我们来玩成语接龙,我先来,生龙活虎"}]]
),
engine_name: str = Body(..., description="知识库名称", examples=["samples"]),
code_limit: int = Body(1, examples=['1']),
cb_search_type: str = Body('', examples=['1']),
stream: bool = Body(False, description="流式输出"),
local_doc_url: bool = Body(False, description="知识文件返回本地路径(true)或URL(false)"),
request: Request = None,
**kargs
):
self.engine_name = engine_name if isinstance(engine_name, str) else engine_name.default
self.code_limit = code_limit
self.stream = stream if isinstance(stream, bool) else stream.default
self.local_doc_url = local_doc_url if isinstance(local_doc_url, bool) else local_doc_url.default
self.request = request
self.cb_search_type = cb_search_type
return self._chat(query, history, **kargs)
def _chat(self, query: str, history: List[History], **kargs):
history = [History(**h) if isinstance(h, dict) else h for h in history]
service_status = self.check_service_status()
if service_status.code != 200: return service_status
def chat_iterator(query: str, history: List[History]):
model = getChatModel()
result, content = self.create_task(query, history, model, **kargs)
# logger.info('result={}'.format(result))
# logger.info('content={}'.format(content))
if self.stream:
for token in content["text"]:
result["answer"] = token
yield json.dumps(result, ensure_ascii=False)
else:
for token in content["text"]:
result["answer"] += token
yield json.dumps(result, ensure_ascii=False)
return StreamingResponse(chat_iterator(query, history),
media_type="text/event-stream")
def create_task(self, query: str, history: List[History], model):
'''Build the LLM generation task'''
chain, context, result = self._process(query, history, model)
logger.info('chain={}'.format(chain))
try:
content = chain({"context": context, "question": query})
except Exception as e:
content = {"text": str(e)}
return result, content
def create_atask(self, query, history, model, callback: AsyncIteratorCallbackHandler):
chain, context, result = self._process(query, history, model)
task = asyncio.create_task(wrap_done(
chain.acall({"context": context, "question": query}), callback.done
))
return task, result

View File

@ -1,79 +0,0 @@
from fastapi import Request
import os, asyncio
from urllib.parse import urlencode
from typing import List
from langchain import LLMChain
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.prompts.chat import ChatPromptTemplate
from configs.model_config import (
llm_model_dict, LLM_MODEL, PROMPT_TEMPLATE,
VECTOR_SEARCH_TOP_K, SCORE_THRESHOLD)
from dev_opsgpt.chat.utils import History, wrap_done
from dev_opsgpt.utils import BaseResponse
from .base_chat import Chat
from dev_opsgpt.service.kb_api import search_docs, KBServiceFactory
from loguru import logger
class KnowledgeChat(Chat):
def __init__(
self,
engine_name: str = "",
top_k: int = VECTOR_SEARCH_TOP_K,
stream: bool = False,
score_thresold: float = SCORE_THRESHOLD,
local_doc_url: bool = False,
request: Request = None,
) -> None:
super().__init__(engine_name, top_k, stream)
self.score_thresold = score_thresold
self.local_doc_url = local_doc_url
self.request = request
def check_service_status(self) -> BaseResponse:
kb = KBServiceFactory.get_service_by_name(self.engine_name)
if kb is None:
return BaseResponse(code=404, msg=f"未找到知识库 {self.engine_name}")
return BaseResponse(code=200, msg=f"找到知识库 {self.engine_name}")
def _process(self, query: str, history: List[History], model):
'''process'''
docs = search_docs(query, self.engine_name, self.top_k, self.score_threshold)
context = "\n".join([doc.page_content for doc in docs])
source_documents = []
for inum, doc in enumerate(docs):
filename = os.path.split(doc.metadata["source"])[-1]
if self.local_doc_url:
url = "file://" + doc.metadata["source"]
else:
parameters = urlencode({"knowledge_base_name": self.engine_name, "file_name":filename})
url = f"{self.request.base_url}knowledge_base/download_doc?" + parameters
text = f"""出处 [{inum + 1}] [{filename}]({url}) \n\n{doc.page_content}\n\n"""
source_documents.append(text)
chat_prompt = ChatPromptTemplate.from_messages(
[i.to_msg_tuple() for i in history] + [("human", PROMPT_TEMPLATE)]
)
chain = LLMChain(prompt=chat_prompt, llm=model)
result = {"answer": "", "docs": source_documents}
return chain, context, result
def create_task(self, query: str, history: List[History], model):
'''Build the LLM generation task'''
logger.debug(f"query: {query}, history: {history}")
chain, context, result = self._process(query, history, model)
try:
content = chain({"context": context, "question": query})
except Exception as e:
content = {"text": str(e)}
return result, content
def create_atask(self, query, history, model, callback: AsyncIteratorCallbackHandler):
chain, context, result = self._process(query, history, model)
task = asyncio.create_task(wrap_done(
chain.acall({"context": context, "question": query}), callback.done
))
return task, result

View File

@ -1,41 +0,0 @@
import asyncio
from typing import List
from langchain import LLMChain
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.prompts.chat import ChatPromptTemplate
from dev_opsgpt.chat.utils import History, wrap_done
from .base_chat import Chat
from loguru import logger
class LLMChat(Chat):
def __init__(
self,
engine_name: str = "",
top_k: int = 1,
stream: bool = False,
) -> None:
super().__init__(engine_name, top_k, stream)
def create_task(self, query: str, history: List[History], model):
'''Build the LLM generation task'''
chat_prompt = ChatPromptTemplate.from_messages(
[i.to_msg_tuple() for i in history] + [("human", "{input}")]
)
chain = LLMChain(prompt=chat_prompt, llm=model)
content = chain({"input": query})
return {"answer": "", "docs": ""}, content
def create_atask(self, query, history, model, callback: AsyncIteratorCallbackHandler):
chat_prompt = ChatPromptTemplate.from_messages(
[i.to_msg_tuple() for i in history] + [("human", "{input}")]
)
chain = LLMChain(prompt=chat_prompt, llm=model)
task = asyncio.create_task(wrap_done(
chain.acall({"input": query}), callback.done
))
return task, {"answer": "", "docs": ""}

View File

@ -1,150 +0,0 @@
from fastapi import Request
import os, asyncio
from typing import List, Optional, Dict
from langchain import LLMChain
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.utilities import BingSearchAPIWrapper, DuckDuckGoSearchAPIWrapper
from langchain.prompts.chat import ChatPromptTemplate
from langchain.docstore.document import Document
from configs.model_config import (
PROMPT_TEMPLATE, SEARCH_ENGINE_TOP_K, BING_SUBSCRIPTION_KEY, BING_SEARCH_URL,
VECTOR_SEARCH_TOP_K, SCORE_THRESHOLD)
from dev_opsgpt.chat.utils import History, wrap_done
from dev_opsgpt.utils import BaseResponse
from .base_chat import Chat
from loguru import logger
from duckduckgo_search import DDGS
def bing_search(text, result_len=SEARCH_ENGINE_TOP_K):
if not (BING_SEARCH_URL and BING_SUBSCRIPTION_KEY):
return [{"snippet": "please set BING_SUBSCRIPTION_KEY and BING_SEARCH_URL in os ENV",
"title": "env info is not found",
"link": "https://python.langchain.com/en/latest/modules/agents/tools/examples/bing_search.html"}]
search = BingSearchAPIWrapper(bing_subscription_key=BING_SUBSCRIPTION_KEY,
bing_search_url=BING_SEARCH_URL)
return search.results(text, result_len)
def duckduckgo_search(
query: str,
result_len: int = SEARCH_ENGINE_TOP_K,
region: Optional[str] = "wt-wt",
safesearch: str = "moderate",
time: Optional[str] = "y",
backend: str = "api",
):
with DDGS(proxies=os.environ.get("DUCKDUCKGO_PROXY")) as ddgs:
results = ddgs.text(
query,
region=region,
safesearch=safesearch,
timelimit=time,
backend=backend,
)
if results is None:
return [{"Result": "No good DuckDuckGo Search Result was found"}]
def to_metadata(result: Dict) -> Dict[str, str]:
if backend == "news":
return {
"date": result["date"],
"title": result["title"],
"snippet": result["body"],
"source": result["source"],
"link": result["url"],
}
return {
"snippet": result["body"],
"title": result["title"],
"link": result["href"],
}
formatted_results = []
for i, res in enumerate(results, 1):
if res is not None:
formatted_results.append(to_metadata(res))
if len(formatted_results) == result_len:
break
return formatted_results
# def duckduckgo_search(text, result_len=SEARCH_ENGINE_TOP_K):
# search = DuckDuckGoSearchAPIWrapper()
# return search.results(text, result_len)
SEARCH_ENGINES = {"duckduckgo": duckduckgo_search,
"bing": bing_search,
}
def search_result2docs(search_results):
docs = []
for result in search_results:
doc = Document(page_content=result["snippet"] if "snippet" in result.keys() else "",
metadata={"source": result["link"] if "link" in result.keys() else "",
"filename": result["title"] if "title" in result.keys() else ""})
docs.append(doc)
return docs
def lookup_search_engine(
query: str,
search_engine_name: str,
top_k: int = SEARCH_ENGINE_TOP_K,
):
results = SEARCH_ENGINES[search_engine_name](query, result_len=top_k)
docs = search_result2docs(results)
return docs
class SearchChat(Chat):
def __init__(
self,
engine_name: str = "",
top_k: int = VECTOR_SEARCH_TOP_K,
stream: bool = False,
) -> None:
super().__init__(engine_name, top_k, stream)
def check_service_status(self) -> BaseResponse:
if self.engine_name not in SEARCH_ENGINES.keys():
return BaseResponse(code=404, msg=f"未支持搜索引擎 {self.engine_name}")
return BaseResponse(code=200, msg=f"支持搜索引擎 {self.engine_name}")
def _process(self, query: str, history: List[History], model):
'''process'''
docs = lookup_search_engine(query, self.engine_name, self.top_k)
context = "\n".join([doc.page_content for doc in docs])
source_documents = [
f"""出处 [{inum + 1}] [{doc.metadata["source"]}]({doc.metadata["source"]}) \n\n{doc.page_content}\n\n"""
for inum, doc in enumerate(docs)
]
chat_prompt = ChatPromptTemplate.from_messages(
[i.to_msg_tuple() for i in history] + [("human", PROMPT_TEMPLATE)]
)
chain = LLMChain(prompt=chat_prompt, llm=model)
result = {"answer": "", "docs": source_documents}
return chain, context, result
def create_task(self, query: str, history: List[History], model):
'''Build the LLM generation task'''
chain, context, result = self._process(query, history, model)
content = chain({"context": context, "question": query})
return result, content
def create_atask(self, query, history, model, callback: AsyncIteratorCallbackHandler):
chain, context, result = self._process(query, history, model)
task = asyncio.create_task(wrap_done(
chain.acall({"context": context, "question": query}), callback.done
))
return task, result

View File

@ -1,30 +0,0 @@
import asyncio
from typing import Awaitable
from pydantic import BaseModel, Field
async def wrap_done(fn: Awaitable, event: asyncio.Event):
"""Wrap an awaitable with a event to signal when it's done or an exception is raised."""
try:
await fn
except Exception as e:
# TODO: handle exception
print(f"Caught exception: {e}")
finally:
# Signal the aiter to stop.
event.set()
class History(BaseModel):
"""
Chat history entry.
Can be built from a dict:
    h = History(**{"role": "user", "content": "你好"})
and converted to a message tuple:
    h.to_msg_tuple() == ("human", "你好")
"""
role: str = Field(...)
content: str = Field(...)
def to_msg_tuple(self):
return "ai" if self.role=="assistant" else "human", self.content

View File

@ -1,7 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/21 下午2:01
@desc:
'''

View File

@ -1,7 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/21 下午2:27
@desc:
'''

View File

@ -1,219 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: code_analyzer.py
@time: 2023/11/21 下午2:27
@desc:
'''
import time
from loguru import logger
from dev_opsgpt.codechat.code_analyzer.code_static_analysis import CodeStaticAnalysis
from dev_opsgpt.codechat.code_analyzer.code_intepreter import CodeIntepreter
from dev_opsgpt.codechat.code_analyzer.code_preprocess import CodePreprocessor
from dev_opsgpt.codechat.code_analyzer.code_dedup import CodeDedup
class CodeAnalyzer:
def __init__(self, language: str):
self.code_preprocessor = CodePreprocessor()
self.code_debup = CodeDedup()
self.code_interperter = CodeIntepreter()
self.code_static_analyzer = CodeStaticAnalysis(language=language)
def analyze(self, code_dict: dict, do_interpret: bool = True):
'''
analyze code
@param code_dict: {fp: code_text}
@param do_interpret: whether to run LLM interpretation of the code
@return:
'''
# preprocess and dedup
st = time.time()
code_dict = self.code_preprocessor.preprocess(code_dict)
code_dict = self.code_debup.dedup(code_dict)
logger.debug('preprocess and dedup rt={}'.format(time.time() - st))
# static analysis
st = time.time()
static_analysis_res = self.code_static_analyzer.analyze(code_dict)
logger.debug('static analysis rt={}'.format(time.time() - st))
# interpretation
if do_interpret:
logger.info('start interpret code')
st = time.time()
code_list = list(code_dict.values())
interpretation = self.code_interperter.get_intepretation_batch(code_list)
logger.debug('interpret rt={}'.format(time.time() - st))
else:
interpretation = {i: '' for i in code_dict.values()}
return static_analysis_res, interpretation
if __name__ == '__main__':
engine = 'openai'
language = 'java'
code_dict = {'1': '''package com.theokanning.openai.client;
import com.theokanning.openai.DeleteResult;
import com.theokanning.openai.OpenAiResponse;
import com.theokanning.openai.audio.TranscriptionResult;
import com.theokanning.openai.audio.TranslationResult;
import com.theokanning.openai.billing.BillingUsage;
import com.theokanning.openai.billing.Subscription;
import com.theokanning.openai.completion.CompletionRequest;
import com.theokanning.openai.completion.CompletionResult;
import com.theokanning.openai.completion.chat.ChatCompletionRequest;
import com.theokanning.openai.completion.chat.ChatCompletionResult;
import com.theokanning.openai.edit.EditRequest;
import com.theokanning.openai.edit.EditResult;
import com.theokanning.openai.embedding.EmbeddingRequest;
import com.theokanning.openai.embedding.EmbeddingResult;
import com.theokanning.openai.engine.Engine;
import com.theokanning.openai.file.File;
import com.theokanning.openai.fine_tuning.FineTuningEvent;
import com.theokanning.openai.fine_tuning.FineTuningJob;
import com.theokanning.openai.fine_tuning.FineTuningJobRequest;
import com.theokanning.openai.finetune.FineTuneEvent;
import com.theokanning.openai.finetune.FineTuneRequest;
import com.theokanning.openai.finetune.FineTuneResult;
import com.theokanning.openai.image.CreateImageRequest;
import com.theokanning.openai.image.ImageResult;
import com.theokanning.openai.model.Model;
import com.theokanning.openai.moderation.ModerationRequest;
import com.theokanning.openai.moderation.ModerationResult;
import io.reactivex.Single;
import okhttp3.MultipartBody;
import okhttp3.RequestBody;
import okhttp3.ResponseBody;
import retrofit2.Call;
import retrofit2.http.*;
import java.time.LocalDate;
public interface OpenAiApi {
@GET("v1/models")
Single<OpenAiResponse<Model>> listModels();
@GET("/v1/models/{model_id}")
Single<Model> getModel(@Path("model_id") String modelId);
@POST("/v1/completions")
Single<CompletionResult> createCompletion(@Body CompletionRequest request);
@Streaming
@POST("/v1/completions")
Call<ResponseBody> createCompletionStream(@Body CompletionRequest request);
@POST("/v1/chat/completions")
Single<ChatCompletionResult> createChatCompletion(@Body ChatCompletionRequest request);
@Streaming
@POST("/v1/chat/completions")
Call<ResponseBody> createChatCompletionStream(@Body ChatCompletionRequest request);
@Deprecated
@POST("/v1/engines/{engine_id}/completions")
Single<CompletionResult> createCompletion(@Path("engine_id") String engineId, @Body CompletionRequest request);
@POST("/v1/edits")
Single<EditResult> createEdit(@Body EditRequest request);
@Deprecated
@POST("/v1/engines/{engine_id}/edits")
Single<EditResult> createEdit(@Path("engine_id") String engineId, @Body EditRequest request);
@POST("/v1/embeddings")
Single<EmbeddingResult> createEmbeddings(@Body EmbeddingRequest request);
@Deprecated
@POST("/v1/engines/{engine_id}/embeddings")
Single<EmbeddingResult> createEmbeddings(@Path("engine_id") String engineId, @Body EmbeddingRequest request);
@GET("/v1/files")
Single<OpenAiResponse<File>> listFiles();
@Multipart
@POST("/v1/files")
Single<File> uploadFile(@Part("purpose") RequestBody purpose, @Part MultipartBody.Part file);
@DELETE("/v1/files/{file_id}")
Single<DeleteResult> deleteFile(@Path("file_id") String fileId);
@GET("/v1/files/{file_id}")
Single<File> retrieveFile(@Path("file_id") String fileId);
@Streaming
@GET("/v1/files/{file_id}/content")
Single<ResponseBody> retrieveFileContent(@Path("file_id") String fileId);
@POST("/v1/fine_tuning/jobs")
Single<FineTuningJob> createFineTuningJob(@Body FineTuningJobRequest request);
@GET("/v1/fine_tuning/jobs")
Single<OpenAiResponse<FineTuningJob>> listFineTuningJobs();
@GET("/v1/fine_tuning/jobs/{fine_tuning_job_id}")
Single<FineTuningJob> retrieveFineTuningJob(@Path("fine_tuning_job_id") String fineTuningJobId);
@POST("/v1/fine_tuning/jobs/{fine_tuning_job_id}/cancel")
Single<FineTuningJob> cancelFineTuningJob(@Path("fine_tuning_job_id") String fineTuningJobId);
@GET("/v1/fine_tuning/jobs/{fine_tuning_job_id}/events")
Single<OpenAiResponse<FineTuningEvent>> listFineTuningJobEvents(@Path("fine_tuning_job_id") String fineTuningJobId);
@Deprecated
@POST("/v1/fine-tunes")
Single<FineTuneResult> createFineTune(@Body FineTuneRequest request);
@POST("/v1/completions")
Single<CompletionResult> createFineTuneCompletion(@Body CompletionRequest request);
@Deprecated
@GET("/v1/fine-tunes")
Single<OpenAiResponse<FineTuneResult>> listFineTunes();
@Deprecated
@GET("/v1/fine-tunes/{fine_tune_id}")
Single<FineTuneResult> retrieveFineTune(@Path("fine_tune_id") String fineTuneId);
@Deprecated
@POST("/v1/fine-tunes/{fine_tune_id}/cancel")
Single<FineTuneResult> cancelFineTune(@Path("fine_tune_id") String fineTuneId);
@Deprecated
@GET("/v1/fine-tunes/{fine_tune_id}/events")
Single<OpenAiResponse<FineTuneEvent>> listFineTuneEvents(@Path("fine_tune_id") String fineTuneId);
@DELETE("/v1/models/{fine_tune_id}")
Single<DeleteResult> deleteFineTune(@Path("fine_tune_id") String fineTuneId);
@POST("/v1/images/generations")
Single<ImageResult> createImage(@Body CreateImageRequest request);
@POST("/v1/images/edits")
Single<ImageResult> createImageEdit(@Body RequestBody requestBody);
@POST("/v1/images/variations")
Single<ImageResult> createImageVariation(@Body RequestBody requestBody);
@POST("/v1/audio/transcriptions")
Single<TranscriptionResult> createTranscription(@Body RequestBody requestBody);
@POST("/v1/audio/translations")
Single<TranslationResult> createTranslation(@Body RequestBody requestBody);
@POST("/v1/moderations")
Single<ModerationResult> createModeration(@Body ModerationRequest request);
@Deprecated
@GET("v1/engines")
Single<OpenAiResponse<Engine>> getEngines();
@Deprecated
@GET("/v1/engines/{engine_id}")
Single<Engine> getEngine(@Path("engine_id") String engineId);
/**
* Account information inquiry: It contains total amount (in US dollars) and other information.
*
* @return
*/
@Deprecated
@GET("v1/dashboard/billing/subscription")
Single<Subscription> subscription();
/**
* Account call interface consumption amount inquiry.
* totalUsage = Total amount used by the account (in US cents).
*
* @param starDate
* @param endDate
* @return Consumption amount information.
*/
@Deprecated
@GET("v1/dashboard/billing/usage")
Single<BillingUsage> billingUsage(@Query("start_date") LocalDate starDate, @Query("end_date") LocalDate endDate);
}''', '2': '''
package com.theokanning.openai;
/**
* OkHttp Interceptor that adds an authorization token header
*
* @deprecated Use {@link com.theokanning.openai.client.AuthenticationInterceptor}
*/
@Deprecated
public class AuthenticationInterceptor extends com.theokanning.openai.client.AuthenticationInterceptor {
AuthenticationInterceptor(String token) {
super(token);
}
}
'''}
ca = CodeAnalyzer(language)
res = ca.analyze(code_dict)
logger.debug(res)

View File

@ -1,31 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: code_dedup.py
@time: 2023/11/21 下午2:27
@desc:
'''
# encoding: utf-8
'''
@author: 温进
@file: java_dedup.py
@time: 2023/10/23 下午5:02
@desc:
'''
class CodeDedup:
def __init__(self):
pass
def dedup(self, code_dict):
code_dict = self.exact_dedup(code_dict)
return code_dict
def exact_dedup(self, code_dict):
res = {}
for fp, code_text in code_dict.items():
if code_text not in res.values():
res[fp] = code_text
return res
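A tiny illustrative run of the exact deduplication above (file names and contents are made up):
dedup = CodeDedup()
print(dedup.dedup({"A.java": "class A {}", "Copy.java": "class A {}"}))
# -> {'A.java': 'class A {}'}   (the identical second file is dropped)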

View File

@ -1,229 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: code_intepreter.py
@time: 2023/11/22 上午11:57
@desc:
'''
from loguru import logger
from langchain.schema import (
HumanMessage,
)
from configs.model_config import CODE_INTERPERT_TEMPLATE
from dev_opsgpt.llm_models.openai_model import getChatModel
class CodeIntepreter:
def __init__(self):
pass
def get_intepretation(self, code_list):
'''
get interpretation of the code
@param code_list:
@return:
'''
chat_model = getChatModel()
res = {}
for code in code_list:
message = CODE_INTERPERT_TEMPLATE.format(code=code)
message = [HumanMessage(content=message)]
chat_res = chat_model.predict_messages(message)
content = chat_res.content
res[code] = content
return res
def get_intepretation_batch(self, code_list):
'''
get interpretation of the code
@param code_list:
@return:
'''
chat_model = getChatModel()
res = {}
messages = []
for code in code_list:
message = CODE_INTERPERT_TEMPLATE.format(code=code)
messages.append(message)
chat_ress = chat_model.batch(messages)
for chat_res, code in zip(chat_ress, code_list):
res[code] = chat_res.content
return res
if __name__ == '__main__':
engine = 'openai'
code_list = ['''package com.theokanning.openai.client;
import com.theokanning.openai.DeleteResult;
import com.theokanning.openai.OpenAiResponse;
import com.theokanning.openai.audio.TranscriptionResult;
import com.theokanning.openai.audio.TranslationResult;
import com.theokanning.openai.billing.BillingUsage;
import com.theokanning.openai.billing.Subscription;
import com.theokanning.openai.completion.CompletionRequest;
import com.theokanning.openai.completion.CompletionResult;
import com.theokanning.openai.completion.chat.ChatCompletionRequest;
import com.theokanning.openai.completion.chat.ChatCompletionResult;
import com.theokanning.openai.edit.EditRequest;
import com.theokanning.openai.edit.EditResult;
import com.theokanning.openai.embedding.EmbeddingRequest;
import com.theokanning.openai.embedding.EmbeddingResult;
import com.theokanning.openai.engine.Engine;
import com.theokanning.openai.file.File;
import com.theokanning.openai.fine_tuning.FineTuningEvent;
import com.theokanning.openai.fine_tuning.FineTuningJob;
import com.theokanning.openai.fine_tuning.FineTuningJobRequest;
import com.theokanning.openai.finetune.FineTuneEvent;
import com.theokanning.openai.finetune.FineTuneRequest;
import com.theokanning.openai.finetune.FineTuneResult;
import com.theokanning.openai.image.CreateImageRequest;
import com.theokanning.openai.image.ImageResult;
import com.theokanning.openai.model.Model;
import com.theokanning.openai.moderation.ModerationRequest;
import com.theokanning.openai.moderation.ModerationResult;
import io.reactivex.Single;
import okhttp3.MultipartBody;
import okhttp3.RequestBody;
import okhttp3.ResponseBody;
import retrofit2.Call;
import retrofit2.http.*;
import java.time.LocalDate;
public interface OpenAiApi {
@GET("v1/models")
Single<OpenAiResponse<Model>> listModels();
@GET("/v1/models/{model_id}")
Single<Model> getModel(@Path("model_id") String modelId);
@POST("/v1/completions")
Single<CompletionResult> createCompletion(@Body CompletionRequest request);
@Streaming
@POST("/v1/completions")
Call<ResponseBody> createCompletionStream(@Body CompletionRequest request);
@POST("/v1/chat/completions")
Single<ChatCompletionResult> createChatCompletion(@Body ChatCompletionRequest request);
@Streaming
@POST("/v1/chat/completions")
Call<ResponseBody> createChatCompletionStream(@Body ChatCompletionRequest request);
@Deprecated
@POST("/v1/engines/{engine_id}/completions")
Single<CompletionResult> createCompletion(@Path("engine_id") String engineId, @Body CompletionRequest request);
@POST("/v1/edits")
Single<EditResult> createEdit(@Body EditRequest request);
@Deprecated
@POST("/v1/engines/{engine_id}/edits")
Single<EditResult> createEdit(@Path("engine_id") String engineId, @Body EditRequest request);
@POST("/v1/embeddings")
Single<EmbeddingResult> createEmbeddings(@Body EmbeddingRequest request);
@Deprecated
@POST("/v1/engines/{engine_id}/embeddings")
Single<EmbeddingResult> createEmbeddings(@Path("engine_id") String engineId, @Body EmbeddingRequest request);
@GET("/v1/files")
Single<OpenAiResponse<File>> listFiles();
@Multipart
@POST("/v1/files")
Single<File> uploadFile(@Part("purpose") RequestBody purpose, @Part MultipartBody.Part file);
@DELETE("/v1/files/{file_id}")
Single<DeleteResult> deleteFile(@Path("file_id") String fileId);
@GET("/v1/files/{file_id}")
Single<File> retrieveFile(@Path("file_id") String fileId);
@Streaming
@GET("/v1/files/{file_id}/content")
Single<ResponseBody> retrieveFileContent(@Path("file_id") String fileId);
@POST("/v1/fine_tuning/jobs")
Single<FineTuningJob> createFineTuningJob(@Body FineTuningJobRequest request);
@GET("/v1/fine_tuning/jobs")
Single<OpenAiResponse<FineTuningJob>> listFineTuningJobs();
@GET("/v1/fine_tuning/jobs/{fine_tuning_job_id}")
Single<FineTuningJob> retrieveFineTuningJob(@Path("fine_tuning_job_id") String fineTuningJobId);
@POST("/v1/fine_tuning/jobs/{fine_tuning_job_id}/cancel")
Single<FineTuningJob> cancelFineTuningJob(@Path("fine_tuning_job_id") String fineTuningJobId);
@GET("/v1/fine_tuning/jobs/{fine_tuning_job_id}/events")
Single<OpenAiResponse<FineTuningEvent>> listFineTuningJobEvents(@Path("fine_tuning_job_id") String fineTuningJobId);
@Deprecated
@POST("/v1/fine-tunes")
Single<FineTuneResult> createFineTune(@Body FineTuneRequest request);
@POST("/v1/completions")
Single<CompletionResult> createFineTuneCompletion(@Body CompletionRequest request);
@Deprecated
@GET("/v1/fine-tunes")
Single<OpenAiResponse<FineTuneResult>> listFineTunes();
@Deprecated
@GET("/v1/fine-tunes/{fine_tune_id}")
Single<FineTuneResult> retrieveFineTune(@Path("fine_tune_id") String fineTuneId);
@Deprecated
@POST("/v1/fine-tunes/{fine_tune_id}/cancel")
Single<FineTuneResult> cancelFineTune(@Path("fine_tune_id") String fineTuneId);
@Deprecated
@GET("/v1/fine-tunes/{fine_tune_id}/events")
Single<OpenAiResponse<FineTuneEvent>> listFineTuneEvents(@Path("fine_tune_id") String fineTuneId);
@DELETE("/v1/models/{fine_tune_id}")
Single<DeleteResult> deleteFineTune(@Path("fine_tune_id") String fineTuneId);
@POST("/v1/images/generations")
Single<ImageResult> createImage(@Body CreateImageRequest request);
@POST("/v1/images/edits")
Single<ImageResult> createImageEdit(@Body RequestBody requestBody);
@POST("/v1/images/variations")
Single<ImageResult> createImageVariation(@Body RequestBody requestBody);
@POST("/v1/audio/transcriptions")
Single<TranscriptionResult> createTranscription(@Body RequestBody requestBody);
@POST("/v1/audio/translations")
Single<TranslationResult> createTranslation(@Body RequestBody requestBody);
@POST("/v1/moderations")
Single<ModerationResult> createModeration(@Body ModerationRequest request);
@Deprecated
@GET("v1/engines")
Single<OpenAiResponse<Engine>> getEngines();
@Deprecated
@GET("/v1/engines/{engine_id}")
Single<Engine> getEngine(@Path("engine_id") String engineId);
/**
* Account information inquiry: It contains total amount (in US dollars) and other information.
*
* @return
*/
@Deprecated
@GET("v1/dashboard/billing/subscription")
Single<Subscription> subscription();
/**
* Account call interface consumption amount inquiry.
* totalUsage = Total amount used by the account (in US cents).
*
* @param starDate
* @param endDate
* @return Consumption amount information.
*/
@Deprecated
@GET("v1/dashboard/billing/usage")
Single<BillingUsage> billingUsage(@Query("start_date") LocalDate starDate, @Query("end_date") LocalDate endDate);
}''', '''
package com.theokanning.openai;
/**
* OkHttp Interceptor that adds an authorization token header
*
* @deprecated Use {@link com.theokanning.openai.client.AuthenticationInterceptor}
*/
@Deprecated
public class AuthenticationInterceptor extends com.theokanning.openai.client.AuthenticationInterceptor {
AuthenticationInterceptor(String token) {
super(token);
}
}
''']
ci = CodeIntepreter()
res = ci.get_intepretation_batch(code_list)
logger.debug(res)

View File

@ -1,14 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: code_preprocess.py
@time: 2023/11/21 下午2:28
@desc:
'''
class CodePreprocessor:
def __init__(self):
pass
def preprocess(self, code_dict):
return code_dict

View File

@ -1,26 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: code_static_analysis.py
@time: 2023/11/21 下午2:28
@desc:
'''
from dev_opsgpt.codechat.code_analyzer.language_static_analysis import *
class CodeStaticAnalysis:
def __init__(self, language):
self.language = language
def analyze(self, code_dict):
'''
analyze code
@param code_list:
@return:
'''
if self.language == 'java':
analyzer = JavaStaticAnalysis()
else:
raise ValueError('language should be one of [java]')
analyze_res = analyzer.analyze(code_dict)
return analyze_res

View File

@ -1,14 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/21 下午4:24
@desc:
'''
from .java_static_analysis import JavaStaticAnalysis
__all__ = [
'JavaStaticAnalysis'
]

View File

@ -1,116 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: java_static_analysis.py
@time: 2023/11/21 下午4:25
@desc:
'''
import os
from loguru import logger
import javalang
class JavaStaticAnalysis:
def __init__(self):
pass
def analyze(self, java_code_dict):
'''
parse java code and extract entities
'''
tree_dict = self.preparse(java_code_dict)
res = self.multi_java_code_parse(tree_dict)
return res
def preparse(self, java_code_dict):
'''
preparse by javalang
< dict of java_code and tree
'''
tree_dict = {}
for fp, java_code in java_code_dict.items():
try:
tree = javalang.parse.parse(java_code)
except Exception as e:
continue
if tree.package is not None:
tree_dict[fp] = {'code': java_code, 'tree': tree}
logger.info('success parse {} files'.format(len(tree_dict)))
return tree_dict
def single_java_code_parse(self, tree, fp):
'''
parse single code file
> tree: javalang parse result
< {pac_name: '', class_name_list: [], func_name_dict: {}, import_pac_name_list: []]}
'''
import_pac_name_list = []
# get imports
import_list = tree.imports
for import_pac in import_list:
import_pac_name = import_pac.path
import_pac_name_list.append(import_pac_name)
fp_last = fp.split(os.path.sep)[-1]
pac_name = tree.package.name + '#' + fp_last
class_name_list = []
func_name_dict = {}
for node in tree.types:
if type(node) in (javalang.tree.ClassDeclaration, javalang.tree.InterfaceDeclaration):
class_name = pac_name + '#' + node.name
class_name_list.append(class_name)
for node_inner in node.body:
if type(node_inner) is javalang.tree.MethodDeclaration:
func_name = class_name + '#' + node_inner.name
# add params name to func_name
params_list = node_inner.parameters
for params in params_list:
params_name = params.type.name
func_name = func_name + '-' + params_name
if class_name not in func_name_dict:
func_name_dict[class_name] = []
func_name_dict[class_name].append(func_name)
res = {
'pac_name': pac_name,
'class_name_list': class_name_list,
'func_name_dict': func_name_dict,
'import_pac_name_list': import_pac_name_list
}
return res
def multi_java_code_parse(self, tree_dict):
'''
parse multiple java code
> tree_list
< parse_result_dict
'''
res_dict = {}
for fp, value in tree_dict.items():
java_code = value['code']
tree = value['tree']
try:
res_dict[java_code] = self.single_java_code_parse(tree, fp)
except Exception as e:
logger.debug(java_code)
raise ImportError
return res_dict
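A small illustrative run of the analyzer above (assumes javalang is installed; the sample code and file name are made up, and the output shape follows the docstrings):
sample = "package demo; import java.util.List; public class A { public void f(List x) {} }"
analysis = JavaStaticAnalysis()
print(analysis.analyze({"A.java": sample}))
# -> {sample: {'pac_name': 'demo#A.java',
#              'class_name_list': ['demo#A.java#A'],
#              'func_name_dict': {'demo#A.java#A': ['demo#A.java#A#f-List']},
#              'import_pac_name_list': ['java.util.List']}}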

View File

@ -1,15 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/21 下午2:02
@desc:
'''
from .zip_crawler import ZipCrawler
from .dir_crawler import DirCrawler
__all__ = [
'ZipCrawler',
'DirCrawler'
]

View File

@ -1,39 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: dir_crawler.py
@time: 2023/11/22 下午2:54
@desc:
'''
from loguru import logger
import os
import glob
class DirCrawler:
@staticmethod
def crawl(path: str, suffix: str):
'''
read local java file in path
> path: path to crawl, must be absolute path like A/B/C
< dict of java code string
'''
java_file_list = glob.glob('{path}{sep}**{sep}*.{suffix}'.format(path=path, sep=os.path.sep, suffix=suffix),
recursive=True)
java_code_dict = {}
logger.info(path)
logger.info('number of file={}'.format(len(java_file_list)))
logger.info(java_file_list)
for java_file in java_file_list:
with open(java_file) as f:
java_code = ''.join(f.readlines())
java_code_dict[java_file] = java_code
return java_code_dict
if __name__ == '__main__':
path = '/Users/bingxu/Desktop/工作/大模型/chatbot/test_code_repo/middleware-alipay-starters-parent'
suffix = 'java'
DirCrawler.crawl(path, suffix)

View File

@ -1,31 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: zip_crawler.py
@time: 2023/11/21 下午2:02
@desc:
'''
from loguru import logger
import zipfile
from dev_opsgpt.codechat.code_crawler.dir_crawler import DirCrawler
class ZipCrawler:
@staticmethod
def crawl(zip_file, output_path, suffix):
'''
unzip to output_path
@param zip_file:
@param output_path:
@return:
'''
logger.info(f'output_path={output_path}')
print(f'output_path={output_path}')
with zipfile.ZipFile(zip_file, 'r') as z:
z.extractall(output_path)
code_dict = DirCrawler.crawl(output_path, suffix)
return code_dict

View File

@ -1,7 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/21 下午2:35
@desc:
'''

View File

@ -1,179 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: code_search.py
@time: 2023/11/21 下午2:35
@desc:
'''
import time
from loguru import logger
from collections import defaultdict
from dev_opsgpt.db_handler.graph_db_handler.nebula_handler import NebulaHandler
from dev_opsgpt.db_handler.vector_db_handler.chroma_handler import ChromaHandler
from dev_opsgpt.codechat.code_search.cypher_generator import CypherGenerator
from dev_opsgpt.codechat.code_search.tagger import Tagger
from dev_opsgpt.embeddings.get_embedding import get_embedding
# search_by_tag
VERTEX_SCORE = 10
HISTORY_VERTEX_SCORE = 5
VERTEX_MERGE_RATIO = 0.5
# search_by_description
MAX_DISTANCE = 1000
class CodeSearch:
def __init__(self, nh: NebulaHandler, ch: ChromaHandler, limit: int = 3):
'''
init
@param nh: NebulaHandler
@param ch: ChromaHandler
@param limit: limit of result
'''
self.nh = nh
self.ch = ch
self.limit = limit
def search_by_tag(self, query: str):
'''
search code by tag
@param query: str
@return:
'''
tagger = Tagger()
tag_list = tagger.generate_tag_query(query)
logger.info(f'query tag={tag_list}')
# get all vertices
vertex_list = self.nh.get_vertices().get('v', [])
vertex_vid_list = [i.as_node().get_id().as_string() for i in vertex_list]
logger.debug(vertex_vid_list)
# update score
vertex_score_dict = defaultdict(lambda: 0)
for vid in vertex_vid_list:
for tag in tag_list:
if tag in vid:
vertex_score_dict[vid] += VERTEX_SCORE
# merge depend adj score
vertex_score_dict_final = {}
for vertex in vertex_score_dict:
cypher = f'''MATCH (v1)-[e]-(v2) where id(v1) == "{vertex}" RETURN v2'''
cypher_res = self.nh.execute_cypher(cypher, self.nh.space_name)
cypher_res_dict = self.nh.result_to_dict(cypher_res)
adj_vertex_list = [i.as_node().get_id().as_string() for i in cypher_res_dict.get('v2', [])]
score = vertex_score_dict.get(vertex, 0)
for adj_vertex in adj_vertex_list:
score += vertex_score_dict.get(adj_vertex, 0) * VERTEX_MERGE_RATIO
if score > 0:
vertex_score_dict_final[vertex] = score
# get most prominent package tag
package_score_dict = defaultdict(lambda: 0)
for vertex, score in vertex_score_dict.items():
package = '#'.join(vertex.split('#')[0:2])
package_score_dict[package] += score
# get respective code
res = []
package_score_tuple = list(package_score_dict.items())
package_score_tuple.sort(key=lambda x: x[1], reverse=True)
ids = [i[0] for i in package_score_tuple]
chroma_res = self.ch.get(ids=ids, include=['metadatas'])
for vertex, score in package_score_tuple:
index = chroma_res['result']['ids'].index(vertex)
code_text = chroma_res['result']['metadatas'][index]['code_text']
res.append({
"vertex": vertex,
"code_text": code_text}
)
if len(res) >= self.limit:
break
return res
def search_by_desciption(self, query: str, engine: str):
'''
search by performing similarity search
@param query:
@return:
'''
query = query.replace(',', '')
query_emb = get_embedding(engine=engine, text_list=[query])
query_emb = query_emb[query]
query_embeddings = [query_emb]
query_result = self.ch.query(query_embeddings=query_embeddings, n_results=self.limit,
include=['metadatas', 'distances'])
logger.debug(query_result)
res = []
for idx, distance in enumerate(query_result['result']['distances'][0]):
if distance < MAX_DISTANCE:
vertex = query_result['result']['ids'][0][idx]
code_text = query_result['result']['metadatas'][0][idx]['code_text']
res.append({
"vertex": vertex,
"code_text": code_text
})
return res
def search_by_cypher(self, query: str):
'''
search by generating cypher
@param query:
@param engine:
@return:
'''
cg = CypherGenerator()
cypher = cg.get_cypher(query)
if not cypher:
return None
cypher_res = self.nh.execute_cypher(cypher, self.nh.space_name)
logger.info(f'cypher execution result={cypher_res}')
if not cypher_res.is_succeeded():
return {
'cypher': '',
'cypher_res': ''
}
res = {
'cypher': cypher,
'cypher_res': cypher_res
}
return res
if __name__ == '__main__':
from configs.server_config import NEBULA_HOST, NEBULA_PORT, NEBULA_USER, NEBULA_PASSWORD, NEBULA_STORAGED_PORT
from configs.server_config import CHROMA_PERSISTENT_PATH
codebase_name = 'testing'
nh = NebulaHandler(host=NEBULA_HOST, port=NEBULA_PORT, username=NEBULA_USER,
password=NEBULA_PASSWORD, space_name=codebase_name)
nh.add_host(NEBULA_HOST, NEBULA_STORAGED_PORT)
time.sleep(0.5)
ch = ChromaHandler(path=CHROMA_PERSISTENT_PATH, collection_name=codebase_name)
cs = CodeSearch(nh, ch)
# res = cs.search_by_tag(tag_list=['createFineTuneCompletion', 'OpenAiApi'])
# logger.debug(res)
# res = cs.search_by_cypher('代码中一共有多少个类', 'openai')
# logger.debug(res)
res = cs.search_by_desciption('使用不同的HTTP请求类型GET、POST、DELETE等来执行不同的操作', 'openai')
logger.debug(res)


@ -1,63 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: cypher_generator.py
@time: 2023/11/24 上午10:17
@desc:
'''
from loguru import logger
from dev_opsgpt.llm_models.openai_model import getChatModel
from dev_opsgpt.utils.postprocess import replace_lt_gt
from langchain.schema import (
HumanMessage,
)
from langchain.chains.graph_qa.prompts import NGQL_GENERATION_PROMPT
schema = '''
Node properties: [{'tag': 'package', 'properties': []}, {'tag': 'class', 'properties': []}, {'tag': 'method', 'properties': []}]
Edge properties: [{'edge': 'contain', 'properties': []}, {'edge': 'depend', 'properties': []}]
Relationships: ['(:package)-[:contain]->(:class)', '(:class)-[:contain]->(:method)', '(:package)-[:contain]->(:package)']
'''
class CypherGenerator:
def __init__(self):
self.model = getChatModel()
def get_cypher(self, query: str):
'''
get cypher from query
@param query:
@return:
'''
content = NGQL_GENERATION_PROMPT.format(schema=schema, question=query)
ans = ''
message = [HumanMessage(content=content)]
chat_res = self.model.predict_messages(message)
ans = chat_res.content
ans = replace_lt_gt(ans)
ans = self.post_process(ans)
return ans
def post_process(self, cypher_res: str):
'''
check whether the generated cypher is valid
@param cypher_res:
@return:
'''
if '(' not in cypher_res or ')' not in cypher_res:
return ''
return cypher_res
if __name__ == '__main__':
query = '代码中一共有多少个类'
cg = CypherGenerator()
cg.get_cypher(query)


@ -1,23 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: tagger.py
@time: 2023/11/24 下午1:32
@desc:
'''
import re
from loguru import logger
class Tagger:
def __init__(self):
pass
def generate_tag_query(self, query):
'''
generate tag from query
'''
# simple extract english
tag_list = re.findall(r'[a-zA-Z\_\.]+', query)
tag_list = list(set(tag_list))
return tag_list
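# Usage sketch: generate_tag_query keeps only the English identifier-like tokens
# of a query, so a mixed Chinese/English question is reduced to its code tags
# (the query string below is illustrative).
if __name__ == '__main__':
    tagger = Tagger()
    logger.debug(tagger.generate_tag_query('intercept 函数的作用是什么 OpenAiApi'))
    # expected (order may vary because of set()): ['intercept', 'OpenAiApi']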


@ -1,7 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/21 下午2:07
@desc:
'''

File diff suppressed because one or more lines are too long


@ -1,169 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: codebase_handler.py
@time: 2023/11/21 下午2:25
@desc:
'''
import time
from loguru import logger
from configs.server_config import NEBULA_HOST, NEBULA_PORT, NEBULA_USER, NEBULA_PASSWORD, NEBULA_STORAGED_PORT
from configs.server_config import CHROMA_PERSISTENT_PATH
from configs.model_config import EMBEDDING_ENGINE
from dev_opsgpt.db_handler.graph_db_handler.nebula_handler import NebulaHandler
from dev_opsgpt.db_handler.vector_db_handler.chroma_handler import ChromaHandler
from dev_opsgpt.codechat.code_crawler.zip_crawler import *
from dev_opsgpt.codechat.code_analyzer.code_analyzer import CodeAnalyzer
from dev_opsgpt.codechat.codebase_handler.code_importer import CodeImporter
from dev_opsgpt.codechat.code_search.code_search import CodeSearch
class CodeBaseHandler:
def __init__(self, codebase_name: str, code_path: str = '',
language: str = 'java', crawl_type: str = 'ZIP'):
self.codebase_name = codebase_name
self.code_path = code_path
self.language = language
self.crawl_type = crawl_type
self.nh = NebulaHandler(host=NEBULA_HOST, port=NEBULA_PORT, username=NEBULA_USER,
password=NEBULA_PASSWORD, space_name=codebase_name)
self.nh.add_host(NEBULA_HOST, NEBULA_STORAGED_PORT)
time.sleep(1)
self.ch = ChromaHandler(path=CHROMA_PERSISTENT_PATH, collection_name=codebase_name)
def import_code(self, zip_file='', do_interpret=True):
'''
analyze code and save it to codekg and codedb
@return:
'''
# init graph to init tag and edge
code_importer = CodeImporter(engine=EMBEDDING_ENGINE, codebase_name=self.codebase_name,
nh=self.nh, ch=self.ch)
code_importer.init_graph()
time.sleep(5)
# crawl code
st0 = time.time()
logger.info('start crawl')
code_dict = self.crawl_code(zip_file)
logger.debug('crawl done, rt={}'.format(time.time() - st0))
# analyze code
logger.info('start analyze')
st1 = time.time()
code_analyzer = CodeAnalyzer(language=self.language)
static_analysis_res, interpretation = code_analyzer.analyze(code_dict, do_interpret=do_interpret)
logger.debug('analyze done, rt={}'.format(time.time() - st1))
# add info to nebula and chroma
st2 = time.time()
code_importer.import_code(static_analysis_res, interpretation, do_interpret=do_interpret)
logger.debug('update codebase done, rt={}'.format(time.time() - st2))
# get KG info
stat = self.nh.get_stat()
vertices_num, edges_num = stat['vertices'], stat['edges']
# get chroma info
file_num = self.ch.count()['result']
return vertices_num, edges_num, file_num
def delete_codebase(self, codebase_name: str):
'''
delete codebase
@param codebase_name: name of codebase
@return:
'''
self.nh.drop_space(space_name=codebase_name)
self.ch.delete_collection(collection_name=codebase_name)
def crawl_code(self, zip_file=''):
'''
@return:
'''
if self.language == 'java':
suffix = 'java'
logger.info(f'crawl_type={self.crawl_type}')
code_dict = {}
if self.crawl_type.lower() == 'zip':
code_dict = ZipCrawler.crawl(zip_file, output_path=self.code_path, suffix=suffix)
elif self.crawl_type.lower() == 'dir':
code_dict = DirCrawler.crawl(self.code_path, suffix)
return code_dict
def search_code(self, query: str, search_type: str, limit: int = 3):
'''
search code from codebase
@param limit: max number of results to return
@param query: query from user
@param search_type: one of ['cypher', 'tag', 'description']
@return:
'''
assert search_type in ['cypher', 'tag', 'description']
code_search = CodeSearch(nh=self.nh, ch=self.ch, limit=limit)
if search_type == 'cypher':
search_res = code_search.search_by_cypher(query=query)
elif search_type == 'tag':
search_res = code_search.search_by_tag(query=query)
elif search_type == 'description':
search_res = code_search.search_by_desciption(query=query, engine=EMBEDDING_ENGINE)
context, related_vertice = self.format_search_res(search_res, search_type)
return context, related_vertice
def format_search_res(self, search_res: str, search_type: str):
'''
format search_res
@param search_res:
@param search_type:
@return:
'''
CYPHER_QA_PROMPT = '''
执行的 Cypher : {cypher}
Cypher 的结果是: {result}
'''
if search_type == 'cypher':
context = CYPHER_QA_PROMPT.format(cypher=search_res['cypher'], result=search_res['cypher_res'])
related_vertice = []
elif search_type == 'tag':
context = ''
related_vertice = []
for code in search_res:
context = context + code['code_text'] + '\n'
related_vertice.append(code['vertex'])
elif search_type == 'description':
context = ''
related_vertice = []
for code in search_res:
context = context + code['code_text'] + '\n'
related_vertice.append(code['vertex'])
return context, related_vertice
if __name__ == '__main__':
codebase_name = 'testing'
code_path = '/Users/bingxu/Desktop/工作/大模型/chatbot/test_code_repo/client'
cbh = CodeBaseHandler(codebase_name, code_path, crawl_type='dir')
# query = '使用不同的HTTP请求类型GET、POST、DELETE等来执行不同的操作'
# query = '代码中一共有多少个类'
query = 'intercept 函数作用是什么'
search_type = 'tag'
limit = 2
res = cbh.search_code(query, search_type, limit)
logger.debug(res)


@ -1,9 +0,0 @@
from .configs import PHASE_CONFIGS
PHASE_LIST = list(PHASE_CONFIGS.keys())
__all__ = [
"PHASE_CONFIGS"
]


@ -1,9 +0,0 @@
from .base_agent import BaseAgent
from .react_agent import ReactAgent
from .check_agent import CheckAgent
from .executor_agent import ExecutorAgent
from .selector_agent import SelectorAgent
__all__ = [
"BaseAgent", "ReactAgent", "CheckAgent", "ExecutorAgent", "SelectorAgent"
]


@ -1,349 +0,0 @@
from pydantic import BaseModel
from typing import List, Union
import re
import copy
import json
import traceback
import uuid
from loguru import logger
from dev_opsgpt.connector.schema import (
Memory, Task, Env, Role, Message, ActionStatus, CodeDoc, Doc
)
from configs.server_config import SANDBOX_SERVER
from dev_opsgpt.sandbox import PyCodeBox, CodeBoxResponse
from dev_opsgpt.tools import DDGSTool, DocRetrieval, CodeRetrieval
from dev_opsgpt.connector.configs.prompts import BASE_PROMPT_INPUT, QUERY_CONTEXT_DOC_PROMPT_INPUT, BEGIN_PROMPT_INPUT
from dev_opsgpt.connector.message_process import MessageUtils
from dev_opsgpt.connector.configs.agent_config import REACT_PROMPT_INPUT, QUERY_CONTEXT_PROMPT_INPUT, PLAN_PROMPT_INPUT
from dev_opsgpt.llm_models import getChatModel, getExtraModel
from dev_opsgpt.connector.utils import parse_section
class BaseAgent:
def __init__(
self,
role: Role,
task: Task = None,
memory: Memory = None,
chat_turn: int = 1,
do_search: bool = False,
do_doc_retrieval: bool = False,
do_tool_retrieval: bool = False,
temperature: float = 0.2,
stop: Union[List[str], str] = None,
do_filter: bool = True,
do_use_self_memory: bool = True,
focus_agents: List[str] = [],
focus_message_keys: List[str] = [],
# prompt_mamnger: PromptManager
):
self.task = task
self.role = role
self.message_utils = MessageUtils(role)
self.llm = self.create_llm_engine(temperature, stop)
self.memory = self.init_history(memory)
self.chat_turn = chat_turn
self.do_search = do_search
self.do_doc_retrieval = do_doc_retrieval
self.do_tool_retrieval = do_tool_retrieval
self.focus_agents = focus_agents
self.focus_message_keys = focus_message_keys
self.do_filter = do_filter
self.do_use_self_memory = do_use_self_memory
# self.prompt_manager = None
def run(self, query: Message, history: Memory = None, background: Memory = None, memory_pool: Memory=None) -> Message:
'''agent response from multi-message'''
message = None
for message in self.arun(query, history, background, memory_pool):
pass
return message
def arun(self, query: Message, history: Memory = None, background: Memory = None, memory_pool: Memory=None) -> Message:
'''agent response from multi-message'''
# insert query into memory
query_c = copy.deepcopy(query)
query_c = self.start_action_step(query_c)
self_memory = self.memory if self.do_use_self_memory else None
# create your llm prompt
prompt = self.create_prompt(query_c, self_memory, history, background, memory_pool=memory_pool)
content = self.llm.predict(prompt)
logger.debug(f"{self.role.role_name} prompt: {prompt}")
logger.debug(f"{self.role.role_name} content: {content}")
output_message = Message(
role_name=self.role.role_name,
role_type="ai", #self.role.role_type,
role_content=content,
step_content=content,
input_query=query_c.input_query,
tools=query_c.tools,
parsed_output_list=[query.parsed_output],
customed_kargs=query_c.customed_kargs
)
# common parse llm' content to message
output_message = self.message_utils.parser(output_message)
if self.do_filter:
output_message = self.message_utils.filter(output_message)
# action step
output_message, observation_message = self.message_utils.step_router(output_message, history, background, memory_pool=memory_pool)
output_message.parsed_output_list.append(output_message.parsed_output)
if observation_message:
output_message.parsed_output_list.append(observation_message.parsed_output)
# update self_memory
self.append_history(query_c)
self.append_history(output_message)
# logger.info(f"{self.role.role_name} currenct question: {output_message.input_query}\nllm_step_run: {output_message.role_content}")
output_message.input_query = output_message.role_content
# output_message.parsed_output_list.append(output_message.parsed_output) # duplicate of the append above?
# end
output_message = self.message_utils.inherit_extrainfo(query, output_message)
output_message = self.end_action_step(output_message)
# update memory pool
memory_pool.append(output_message)
yield output_message
def create_prompt(
self, query: Message, memory: Memory =None, history: Memory = None, background: Memory = None, memory_pool: Memory=None, prompt_mamnger=None) -> str:
'''
prompt engineer, contains role\task\tools\docs\memory
'''
#
doc_infos = self.create_doc_prompt(query)
code_infos = self.create_codedoc_prompt(query)
#
formatted_tools, tool_names, _ = self.create_tools_prompt(query)
task_prompt = self.create_task_prompt(query)
background_prompt = self.create_background_prompt(background, control_key="step_content")
history_prompt = self.create_history_prompt(history)
selfmemory_prompt = self.create_selfmemory_prompt(memory, control_key="step_content")
# extra_system_prompt = self.role.role_prompt
prompt = self.role.role_prompt.format(**{"formatted_tools": formatted_tools, "tool_names": tool_names})
#
memory_pool_select_by_agent_key = self.select_memory_by_agent_key(memory_pool)
memory_pool_select_by_agent_key_context = '\n\n'.join([f"*{k}*\n{v}" for parsed_output in memory_pool_select_by_agent_key.get_parserd_output_list() for k, v in parsed_output.items() if k not in ['Action Status']])
# input_query = query.input_query
# # logger.debug(f"{self.role.role_name} extra_system_prompt: {self.role.role_prompt}")
# # logger.debug(f"{self.role.role_name} input_query: {input_query}")
# # logger.debug(f"{self.role.role_name} doc_infos: {doc_infos}")
# # logger.debug(f"{self.role.role_name} tool_names: {tool_names}")
# if "**Context:**" in self.role.role_prompt:
# # logger.debug(f"parsed_output_list: {query.parsed_output_list}")
# # input_query = "'''" + "\n".join([f"###{k}###\n{v}" for i in query.parsed_output_list for k,v in i.items() if "Action Status" !=k]) + "'''"
# context = "\n".join([f"*{k}*\n{v}" for i in query.parsed_output_list for k,v in i.items() if "Action Status" !=k])
# # context = history_prompt or '""'
# # logger.debug(f"parsed_output_list: {t}")
# prompt += "\n" + QUERY_CONTEXT_PROMPT_INPUT.format(**{"context": context, "query": query.origin_query})
# else:
# prompt += "\n" + PLAN_PROMPT_INPUT.format(**{"query": input_query})
task = query.task or self.task
if task_prompt is not None:
prompt += "\n" + task.task_prompt
DocInfos = ""
if doc_infos is not None and doc_infos!="" and doc_infos!="不存在知识库辅助信息":
DocInfos += f"\nDocument Information: {doc_infos}"
if code_infos is not None and code_infos!="" and code_infos!="不存在代码库辅助信息":
DocInfos += f"\nCodeBase Information: {code_infos}"
# if selfmemory_prompt:
# prompt += "\n" + selfmemory_prompt
# if background_prompt:
# prompt += "\n" + background_prompt
# if history_prompt:
# prompt += "\n" + history_prompt
input_query = query.input_query
# logger.debug(f"{self.role.role_name} extra_system_prompt: {self.role.role_prompt}")
# logger.debug(f"{self.role.role_name} input_query: {input_query}")
# logger.debug(f"{self.role.role_name} doc_infos: {doc_infos}")
# logger.debug(f"{self.role.role_name} tool_names: {tool_names}")
# extra_system_prompt = self.role.role_prompt
input_keys = parse_section(self.role.role_prompt, 'Input Format')
prompt = self.role.role_prompt.format(**{"formatted_tools": formatted_tools, "tool_names": tool_names})
prompt += "\n" + BEGIN_PROMPT_INPUT
for input_key in input_keys:
if input_key == "Origin Query":
prompt += "\n**Origin Query:**\n" + query.origin_query
elif input_key == "Context":
context = "\n".join([f"*{k}*\n{v}" for i in query.parsed_output_list for k,v in i.items() if "Action Status" !=k])
if history:
context = history_prompt + "\n" + context
if not context:
context = "there is no context"
if self.focus_agents and memory_pool_select_by_agent_key_context:
context = memory_pool_select_by_agent_key_context
prompt += "\n**Context:**\n" + context + "\n" + input_query
elif input_key == "DocInfos":
if DocInfos:
prompt += "\n**DocInfos:**\n" + DocInfos
else:
prompt += "\n**DocInfos:**\n" + "Empty"
elif input_key == "Question":
prompt += "\n**Question:**\n" + input_query
# if "**Context:**" in self.role.role_prompt:
# # logger.debug(f"parsed_output_list: {query.parsed_output_list}")
# # input_query = "'''" + "\n".join([f"###{k}###\n{v}" for i in query.parsed_output_list for k,v in i.items() if "Action Status" !=k]) + "'''"
# context = "\n".join([f"*{k}*\n{v}" for i in query.parsed_output_list for k,v in i.items() if "Action Status" !=k])
# if history:
# context = history_prompt + "\n" + context
# if not context:
# context = "there is no context"
# # logger.debug(f"parsed_output_list: {t}")
# if "DocInfos" in prompt:
# prompt += "\n" + QUERY_CONTEXT_DOC_PROMPT_INPUT.format(**{"context": context, "query": query.origin_query, "DocInfos": DocInfos})
# else:
# prompt += "\n" + QUERY_CONTEXT_PROMPT_INPUT.format(**{"context": context, "query": query.origin_query, "DocInfos": DocInfos})
# else:
# prompt += "\n" + BASE_PROMPT_INPUT.format(**{"query": input_query})
# prompt = extra_system_prompt.format(**{"query": input_query, "doc_infos": doc_infos, "formatted_tools": formatted_tools, "tool_names": tool_names})
while "{{" in prompt or "}}" in prompt:
prompt = prompt.replace("{{", "{")
prompt = prompt.replace("}}", "}")
# logger.debug(f"{self.role.role_name} prompt: {prompt}")
return prompt
def create_doc_prompt(self, message: Message) -> str:
''''''
db_docs = message.db_docs
search_docs = message.search_docs
doc_infos = "\n".join([doc.get_snippet() for doc in db_docs] + [doc.get_snippet() for doc in search_docs])
return doc_infos or "不存在知识库辅助信息"
def create_codedoc_prompt(self, message: Message) -> str:
''''''
code_docs = message.code_docs
doc_infos = "\n".join([doc.get_code() for doc in code_docs])
return doc_infos or "不存在代码库辅助信息"
def create_tools_prompt(self, message: Message) -> str:
tools = message.tools
tool_strings = []
tools_descs = []
for tool in tools:
args_schema = re.sub("}", "}}}}", re.sub("{", "{{{{", str(tool.args)))
tool_strings.append(f"{tool.name}: {tool.description}, args: {args_schema}")
tools_descs.append(f"{tool.name}: {tool.description}")
formatted_tools = "\n".join(tool_strings)
tools_desc_str = "\n".join(tools_descs)
tool_names = ", ".join([tool.name for tool in tools])
return formatted_tools, tool_names, tools_desc_str
def create_task_prompt(self, message: Message) -> str:
task = message.task or self.task
return "\n任务目标: " + task.task_prompt if task is not None else None
def create_background_prompt(self, background: Memory, control_key="role_content") -> str:
background_message = None if background is None else background.to_str_messages(content_key=control_key)
# logger.debug(f"background_message: {background_message}")
if background_message:
background_message = re.sub("}", "}}", re.sub("{", "{{", background_message))
return "\n背景信息: " + background_message if background_message else None
def create_history_prompt(self, history: Memory, control_key="role_content") -> str:
history_message = None if history is None else history.to_str_messages(content_key=control_key)
if history_message:
history_message = re.sub("}", "}}", re.sub("{", "{{", history_message))
return "\n补充对话信息: " + history_message if history_message else None
def create_selfmemory_prompt(self, selfmemory: Memory, control_key="role_content") -> str:
selfmemory_message = None if selfmemory is None else selfmemory.to_str_messages(content_key=control_key)
if selfmemory_message:
selfmemory_message = re.sub("}", "}}", re.sub("{", "{{", selfmemory_message))
return "\n补充自身对话信息: " + selfmemory_message if selfmemory_message else None
def init_history(self, memory: Memory = None) -> Memory:
return Memory(messages=[])
def update_history(self, message: Message):
self.memory.append(message)
def append_history(self, message: Message):
self.memory.append(message)
def clear_history(self, ):
self.memory.clear()
self.memory = self.init_history()
def create_llm_engine(self, temperature=0.2, stop=None):
return getChatModel(temperature=temperature, stop=stop)
def registry_actions(self, actions):
'''register llm's actions'''
self.action_list = actions
def start_action_step(self, message: Message) -> Message:
'''do action before agent predict '''
# action_json = self.start_action()
# message["customed_kargs"]["xx"] = action_json
return message
def end_action_step(self, message: Message) -> Message:
'''do action after agent predict '''
# action_json = self.end_action()
# message["customed_kargs"]["xx"] = action_json
return message
def token_usage(self, ):
'''calculate the usage of token'''
pass
def select_memory_by_key(self, memory: Memory) -> Memory:
return Memory(
messages=[self.select_message_by_key(message) for message in memory.messages
if self.select_message_by_key(message) is not None]
)
def select_memory_by_agent_key(self, memory: Memory) -> Memory:
return Memory(
messages=[self.select_message_by_agent_key(message) for message in memory.messages
if self.select_message_by_agent_key(message) is not None]
)
def select_message_by_agent_key(self, message: Message) -> Message:
# assume we focus all agents
if self.focus_agents == []:
return message
return None if message is None or message.role_name not in self.focus_agents else self.select_message_by_key(message)
def select_message_by_key(self, message: Message) -> Message:
# assume we focus all key contents
if message is None:
return message
if self.focus_message_keys == []:
return message
message_c = copy.deepcopy(message)
message_c.parsed_output = {k: v for k,v in message_c.parsed_output.items() if k in self.focus_message_keys}
message_c.parsed_output_list = [{k: v for k,v in parsed_output.items() if k in self.focus_message_keys} for parsed_output in message_c.parsed_output_list]
return message_c
def get_memory(self, content_key="role_content"):
return self.memory.to_tuple_messages(content_key="step_content")
def get_memory_str(self, content_key="role_content"):
return "\n".join([": ".join(i) for i in self.memory.to_tuple_messages(content_key="step_content")])


@ -1,110 +0,0 @@
from pydantic import BaseModel
from typing import List, Union
import re
import json
import traceback
import copy
from loguru import logger
from langchain.prompts.chat import ChatPromptTemplate
from dev_opsgpt.connector.schema import (
Memory, Task, Env, Role, Message, ActionStatus
)
from dev_opsgpt.llm_models import getChatModel
from dev_opsgpt.connector.configs.agent_config import REACT_PROMPT_INPUT, CONTEXT_PROMPT_INPUT, QUERY_CONTEXT_PROMPT_INPUT
from .base_agent import BaseAgent
class CheckAgent(BaseAgent):
def __init__(
self,
role: Role,
task: Task = None,
memory: Memory = None,
chat_turn: int = 1,
do_search: bool = False,
do_doc_retrieval: bool = False,
do_tool_retrieval: bool = False,
temperature: float = 0.2,
stop: Union[List[str], str] = None,
do_filter: bool = True,
do_use_self_memory: bool = True,
focus_agents: List[str] = [],
focus_message_keys: List[str] = [],
# prompt_mamnger: PromptManager
):
super().__init__(role, task, memory, chat_turn, do_search, do_doc_retrieval,
do_tool_retrieval, temperature, stop, do_filter,do_use_self_memory,
focus_agents, focus_message_keys
)
def create_prompt(
self, query: Message, memory: Memory =None, history: Memory = None, background: Memory = None, memory_pool: Memory=None, prompt_mamnger=None) -> str:
'''
role\task\tools\docs\memory
'''
#
doc_infos = self.create_doc_prompt(query)
code_infos = self.create_codedoc_prompt(query)
#
formatted_tools, tool_names, _ = self.create_tools_prompt(query)
task_prompt = self.create_task_prompt(query)
background_prompt = self.create_background_prompt(background)
history_prompt = self.create_history_prompt(history)
selfmemory_prompt = self.create_selfmemory_prompt(memory, control_key="step_content")
# the react flow is a self-iterating process; content from a second trigger should be treated as history dialogue information
# input_query = react_memory.to_tuple_messages(content_key="step_content")
input_query = query.input_query
# logger.debug(f"{self.role.role_name} extra_system_prompt: {self.role.role_prompt}")
# logger.debug(f"{self.role.role_name} input_query: {input_query}")
# logger.debug(f"{self.role.role_name} doc_infos: {doc_infos}")
# logger.debug(f"{self.role.role_name} tool_names: {tool_names}")
# prompt += "\n" + CHECK_PROMPT_INPUT.format(**{"query": input_query})
# prompt.format(**{"query": input_query})
# extra_system_prompt = self.role.role_prompt
prompt = self.role.role_prompt.format(**{"query": input_query, "formatted_tools": formatted_tools, "tool_names": tool_names})
if "**Context:**" in self.role.role_prompt:
# logger.debug(f"parsed_output_list: {query.parsed_output_list}")
# input_query = "'''" + "\n".join([f"*{k}*\n{v}" for i in background.get_parserd_output_list() for k,v in i.items() if "Action Status" !=k]) + "'''"
context = "\n".join([f"*{k}*\n{v}" for i in background.get_parserd_output_list() for k,v in i.items() if "Action Status" !=k])
# logger.debug(context)
# logger.debug(f"parsed_output_list: {t}")
prompt += "\n" + QUERY_CONTEXT_PROMPT_INPUT.format(**{"query": query.origin_query, "context": context})
else:
prompt += "\n" + REACT_PROMPT_INPUT.format(**{"query": input_query})
task = query.task or self.task
if task_prompt is not None:
prompt += "\n" + task.task_prompt
# if doc_infos is not None and doc_infos!="" and doc_infos!="不存在知识库辅助信息":
# prompt += f"\n知识库信息: {doc_infos}"
# if code_infos is not None and code_infos!="" and code_infos!="不存在代码库辅助信息":
# prompt += f"\n代码库信息: {code_infos}"
# if background_prompt:
# prompt += "\n" + background_prompt
# if history_prompt:
# prompt += "\n" + history_prompt
# if selfmemory_prompt:
# prompt += "\n" + selfmemory_prompt
# prompt = extra_system_prompt.format(**{"query": input_query, "doc_infos": doc_infos, "formatted_tools": formatted_tools, "tool_names": tool_names})
while "{{" in prompt or "}}" in prompt:
prompt = prompt.replace("{{", "{")
prompt = prompt.replace("}}", "}")
# logger.debug(f"{self.role.role_name} prompt: {prompt}")
return prompt


@ -1,217 +0,0 @@
from pydantic import BaseModel
from typing import List, Union, Tuple, Any
import re
import json
import traceback
import copy
from loguru import logger
from langchain.prompts.chat import ChatPromptTemplate
from dev_opsgpt.connector.schema import (
Memory, Task, Env, Role, Message, ActionStatus
)
from dev_opsgpt.llm_models import getChatModel
from dev_opsgpt.connector.configs.prompts import EXECUTOR_PROMPT_INPUT, BEGIN_PROMPT_INPUT
from dev_opsgpt.connector.utils import parse_section
from .base_agent import BaseAgent
class ExecutorAgent(BaseAgent):
def __init__(
self,
role: Role,
task: Task = None,
memory: Memory = None,
chat_turn: int = 1,
do_search: bool = False,
do_doc_retrieval: bool = False,
do_tool_retrieval: bool = False,
temperature: float = 0.2,
stop: Union[List[str], str] = None,
do_filter: bool = True,
do_use_self_memory: bool = True,
focus_agents: List[str] = [],
focus_message_keys: List[str] = [],
# prompt_mamnger: PromptManager
):
super().__init__(role, task, memory, chat_turn, do_search, do_doc_retrieval,
do_tool_retrieval, temperature, stop, do_filter,do_use_self_memory,
focus_agents, focus_message_keys
)
self.do_all_task = True # run all tasks
def arun(self, query: Message, history: Memory = None, background: Memory = None, memory_pool: Memory=None) -> Message:
'''agent response from multi-message'''
# insert query into memory
task_executor_memory = Memory(messages=[])
# insert query
output_message = Message(
role_name=self.role.role_name,
role_type="ai", #self.role.role_type,
role_content=query.input_query,
step_content="",
input_query=query.input_query,
tools=query.tools,
parsed_output_list=[query.parsed_output],
customed_kargs=query.customed_kargs
)
self_memory = self.memory if self.do_use_self_memory else None
plan_step = int(query.parsed_output.get("PLAN_STEP", 0))
# when there is no usable PLAN list: PLAN is missing, PLAN is a plain string, or plan_step is out of range
if "PLAN" not in query.parsed_output or isinstance(query.parsed_output.get("PLAN", []), str) or plan_step >= len(query.parsed_output.get("PLAN", [])):
query_c = copy.deepcopy(query)
query_c = self.start_action_step(query_c)
query_c.parsed_output = {"Question": query_c.input_query}
task_executor_memory.append(query_c)
for output_message, task_executor_memory in self._arun_step(output_message, query_c, self_memory, history, background, memory_pool, task_executor_memory):
pass
# task_executor_memory.append(query_c)
# content = "the execution step of the plan is exceed the planned scope."
# output_message.parsed_dict = {"Thought": content, "Action Status": "finished", "Action": content}
# task_executor_memory.append(output_message)
elif "PLAN" in query.parsed_output:
logger.debug(f"{query.parsed_output['PLAN']}")
if self.do_all_task:
# run all tasks step by step
for task_content in query.parsed_output["PLAN"][plan_step:]:
# create your llm prompt
query_c = copy.deepcopy(query)
query_c.parsed_output = {"Question": task_content}
task_executor_memory.append(query_c)
for output_message, task_executor_memory in self._arun_step(output_message, query_c, self_memory, history, background, memory_pool, task_executor_memory):
pass
yield output_message
else:
query_c = copy.deepcopy(query)
query_c = self.start_action_step(query_c)
task_content = query_c.parsed_output["PLAN"][plan_step]
query_c.parsed_output = {"Question": task_content}
task_executor_memory.append(query_c)
for output_message, task_executor_memory in self._arun_step(output_message, query_c, self_memory, history, background, memory_pool, task_executor_memory):
pass
output_message.parsed_output.update({"CURRENT_STEP": plan_step})
# update self_memory
self.append_history(query)
self.append_history(output_message)
# logger.info(f"{self.role.role_name} currenct question: {output_message.input_query}\nllm_executor_run: {output_message.step_content}")
# logger.info(f"{self.role.role_name} currenct parserd_output_list: {output_message.parserd_output_list}")
output_message.input_query = output_message.role_content
# end_action_step
output_message = self.end_action_step(output_message)
# update memory pool
memory_pool.append(output_message)
yield output_message
def _arun_step(self, output_message: Message, query: Message, self_memory: Memory,
history: Memory, background: Memory, memory_pool: Memory,
react_memory: Memory) -> Union[Message, Memory]:
'''execute the llm predict by created prompt'''
prompt = self.create_prompt(query, self_memory, history, background, memory_pool=memory_pool, react_memory=react_memory)
content = self.llm.predict(prompt)
# logger.debug(f"{self.role.role_name} prompt: {prompt}")
logger.debug(f"{self.role.role_name} content: {content}")
output_message.role_content = content
output_message.step_content += "\n"+output_message.role_content
output_message = self.message_utils.parser(output_message)
# according the output to choose one action for code_content or tool_content
output_message, observation_message = self.message_utils.step_router(output_message)
# logger.debug(f"{self.role.role_name} content: {content}")
# update parsed_output_list
output_message.parsed_output_list.append(output_message.parsed_output)
react_message = copy.deepcopy(output_message)
react_memory.append(react_message)
if observation_message:
react_memory.append(observation_message)
output_message.parsed_output_list.append(observation_message.parsed_output)
logger.debug(f"{observation_message.role_name} content: {observation_message.role_content}")
yield output_message, react_memory
def create_prompt(
self, query: Message, memory: Memory =None, history: Memory = None, background: Memory = None, memory_pool: Memory=None, react_memory: Memory = None, prompt_mamnger=None) -> str:
'''
role\task\tools\docs\memory
'''
#
doc_infos = self.create_doc_prompt(query)
code_infos = self.create_codedoc_prompt(query)
#
formatted_tools, tool_names, _ = self.create_tools_prompt(query)
task_prompt = self.create_task_prompt(query)
background_prompt = self.create_background_prompt(background, control_key="step_content")
history_prompt = self.create_history_prompt(history)
selfmemory_prompt = self.create_selfmemory_prompt(memory, control_key="step_content")
#
memory_pool_select_by_agent_key = self.select_memory_by_agent_key(memory_pool)
memory_pool_select_by_agent_key_context = '\n\n'.join([
f"*{k}*\n{v}" for parsed_output in memory_pool_select_by_agent_key.get_parserd_output_list() for k, v in parsed_output.items() if k not in ['Action Status']
])
DocInfos = ""
if doc_infos is not None and doc_infos!="" and doc_infos!="不存在知识库辅助信息":
DocInfos += f"\nDocument Information: {doc_infos}"
if code_infos is not None and code_infos!="" and code_infos!="不存在代码库辅助信息":
DocInfos += f"\nCodeBase Information: {code_infos}"
# extra_system_prompt = self.role.role_prompt
prompt = self.role.role_prompt.format(**{"formatted_tools": formatted_tools, "tool_names": tool_names})
# input_query = react_memory.to_tuple_messages(content_key="role_content")
# logger.debug(f"get_parserd_dict {react_memory.get_parserd_output()}")
input_query = "\n".join(["\n".join([f"**{k}:**\n{v}" for k,v in _dict.items()]) for _dict in react_memory.get_parserd_output()])
# input_query = query.input_query + "\n".join([f"{v}" for k, v in input_query if v])
last_agent_parsed_output = "\n".join(["\n".join([f"*{k}*\n{v}" for k,v in _dict.items()]) for _dict in query.parsed_output_list])
react_parsed_output = "\n".join(["\n".join([f"*{k}_context*\n{v}" for k,v in _dict.items()]) for _dict in react_memory.get_parserd_output()[:-1]])
#
prompt += "\n" + BEGIN_PROMPT_INPUT
input_keys = parse_section(self.role.role_prompt, 'Input Format')
if input_keys:
for input_key in input_keys:
if input_key == "Origin Query":
prompt += "\n**Origin Query:**\n" + query.origin_query
elif input_key == "DocInfos":
prompt += "\n**DocInfos:**\n" + DocInfos
elif input_key == "Context":
if self.focus_agents and memory_pool_select_by_agent_key_context:
context = memory_pool_select_by_agent_key_context
else:
context = last_agent_parsed_output
prompt += "\n**Context:**\n" + context + f"\n{react_parsed_output}"
elif input_key == "Question":
prompt += "\n**Question:**\n" + query.parsed_output.get("Question")
else:
prompt += "\n" + input_query
task = query.task or self.task
# if task_prompt is not None:
# prompt += "\n" + task.task_prompt
# if selfmemory_prompt:
# prompt += "\n" + selfmemory_prompt
# if background_prompt:
# prompt += "\n" + background_prompt
# if history_prompt:
# prompt += "\n" + history_prompt
# prompt = extra_system_prompt.format(**{"query": input_query, "doc_infos": doc_infos, "formatted_tools": formatted_tools, "tool_names": tool_names})
while "{{" in prompt or "}}" in prompt:
prompt = prompt.replace("{{", "{")
prompt = prompt.replace("}}", "}")
return prompt
def set_task(self, do_all_task):
'''set task exec type'''
self.do_all_task = do_all_task

View File

@ -1,178 +0,0 @@
from pydantic import BaseModel
from typing import List, Union
import re
import json
import traceback
import copy
from loguru import logger
from langchain.prompts.chat import ChatPromptTemplate
from dev_opsgpt.connector.schema import (
Memory, Task, Env, Role, Message, ActionStatus
)
from dev_opsgpt.llm_models import getChatModel
from dev_opsgpt.connector.configs.agent_config import REACT_PROMPT_INPUT
from .base_agent import BaseAgent
class ReactAgent(BaseAgent):
def __init__(
self,
role: Role,
task: Task = None,
memory: Memory = None,
chat_turn: int = 1,
do_search: bool = False,
do_doc_retrieval: bool = False,
do_tool_retrieval: bool = False,
temperature: float = 0.2,
stop: Union[List[str], str] = None,
do_filter: bool = True,
do_use_self_memory: bool = True,
focus_agents: List[str] = [],
focus_message_keys: List[str] = [],
# prompt_mamnger: PromptManager
):
super().__init__(role, task, memory, chat_turn, do_search, do_doc_retrieval,
do_tool_retrieval, temperature, stop, do_filter,do_use_self_memory,
focus_agents, focus_message_keys
)
def run(self, query: Message, history: Memory = None, background: Memory = None, memory_pool: Memory = None) -> Message:
'''agent response from multi-message'''
for message in self.arun(query, history, background, memory_pool):
pass
return message
def arun(self, query: Message, history: Memory = None, background: Memory = None, memory_pool: Memory = None) -> Message:
'''agent response from multi-message'''
step_nums = copy.deepcopy(self.chat_turn)
react_memory = Memory(messages=[])
# insert query
output_message = Message(
role_name=self.role.role_name,
role_type="ai", #self.role.role_type,
role_content=query.input_query,
step_content="",
input_query=query.input_query,
tools=query.tools,
parsed_output_list=[query.parsed_output],
customed_kargs=query.customed_kargs
)
query_c = copy.deepcopy(query)
query_c = self.start_action_step(query_c)
if query.parsed_output:
query_c.parsed_output = {"Question": "\n".join([f"{v}" for k, v in query.parsed_output.items() if k not in ["Action Status"]])}
else:
query_c.parsed_output = {"Question": query.input_query}
react_memory.append(query_c)
self_memory = self.memory if self.do_use_self_memory else None
idx = 0
# start to react
while step_nums > 0:
output_message.role_content = output_message.step_content
prompt = self.create_prompt(query, self_memory, history, background, react_memory, memory_pool)
try:
content = self.llm.predict(prompt)
except Exception as e:
logger.warning(f"error prompt: {prompt}")
raise Exception(traceback.format_exc())
output_message.role_content = "\n"+content
output_message.step_content += "\n"+output_message.role_content
yield output_message
# logger.debug(f"{self.role.role_name}, {idx} iteration prompt: {prompt}")
logger.info(f"{self.role.role_name}, {idx} iteration step_run: {output_message.role_content}")
output_message = self.message_utils.parser(output_message)
# when get finished signal can stop early
if output_message.action_status == ActionStatus.FINISHED or output_message.action_status == ActionStatus.STOPED: break
# according the output to choose one action for code_content or tool_content
output_message, observation_message = self.message_utils.step_router(output_message)
output_message.parsed_output_list.append(output_message.parsed_output)
react_message = copy.deepcopy(output_message)
react_memory.append(react_message)
if observation_message:
react_memory.append(observation_message)
output_message.parsed_output_list.append(observation_message.parsed_output)
# logger.debug(f"{observation_message.role_name} content: {observation_message.role_content}")
# logger.info(f"{self.role.role_name} currenct question: {output_message.input_query}\nllm_react_run: {output_message.role_content}")
idx += 1
step_nums -= 1
yield output_message
# react' self_memory saved at last
self.append_history(output_message)
# update memory pool
# memory_pool.append(output_message)
output_message.input_query = query.input_query
# end_action_step
output_message = self.end_action_step(output_message)
# update memory pool
memory_pool.append(output_message)
yield output_message
def create_prompt(
self, query: Message, memory: Memory =None, history: Memory = None, background: Memory = None, react_memory: Memory = None, memory_pool: Memory= None,
prompt_mamnger=None) -> str:
'''
role\task\tools\docs\memory
'''
#
doc_infos = self.create_doc_prompt(query)
code_infos = self.create_codedoc_prompt(query)
#
formatted_tools, tool_names, _ = self.create_tools_prompt(query)
task_prompt = self.create_task_prompt(query)
background_prompt = self.create_background_prompt(background)
history_prompt = self.create_history_prompt(history)
selfmemory_prompt = self.create_selfmemory_prompt(memory, control_key="step_content")
#
# extra_system_prompt = self.role.role_prompt
prompt = self.role.role_prompt.format(**{"formatted_tools": formatted_tools, "tool_names": tool_names})
# the react flow is a self-iterating process; content from a second trigger should be treated as history dialogue information
# input_query = react_memory.to_tuple_messages(content_key="step_content")
# # input_query = query.input_query + "\n" + "\n".join([f"{v}" for k, v in input_query if v])
# input_query = "\n".join([f"{v}" for k, v in input_query if v])
input_query = "\n".join(["\n".join([f"**{k}:**\n{v}" for k,v in _dict.items()]) for _dict in react_memory.get_parserd_output()])
# logger.debug(f"input_query: {input_query}")
prompt += "\n" + REACT_PROMPT_INPUT.format(**{"query": input_query})
task = query.task or self.task
# if task_prompt is not None:
# prompt += "\n" + task.task_prompt
# if doc_infos is not None and doc_infos!="" and doc_infos!="不存在知识库辅助信息":
# prompt += f"\n知识库信息: {doc_infos}"
# if code_infos is not None and code_infos!="" and code_infos!="不存在代码库辅助信息":
# prompt += f"\n代码库信息: {code_infos}"
# if background_prompt:
# prompt += "\n" + background_prompt
# if history_prompt:
# prompt += "\n" + history_prompt
# if selfmemory_prompt:
# prompt += "\n" + selfmemory_prompt
# logger.debug(f"{self.role.role_name} extra_system_prompt: {self.role.role_prompt}")
# logger.debug(f"{self.role.role_name} input_query: {input_query}")
# logger.debug(f"{self.role.role_name} doc_infos: {doc_infos}")
# logger.debug(f"{self.role.role_name} tool_names: {tool_names}")
# prompt += "\n" + REACT_PROMPT_INPUT.format(**{"query": input_query})
# prompt = extra_system_prompt.format(**{"query": input_query, "doc_infos": doc_infos, "formatted_tools": formatted_tools, "tool_names": tool_names})
while "{{" in prompt or "}}" in prompt:
prompt = prompt.replace("{{", "{")
prompt = prompt.replace("}}", "}")
return prompt


@ -1,165 +0,0 @@
from pydantic import BaseModel
from typing import List, Union
import re
import json
import traceback
import copy
import random
from loguru import logger
from langchain.prompts.chat import ChatPromptTemplate
from dev_opsgpt.connector.schema import (
Memory, Task, Env, Role, Message, ActionStatus
)
from dev_opsgpt.llm_models import getChatModel
from dev_opsgpt.connector.configs.prompts import BASE_PROMPT_INPUT, QUERY_CONTEXT_DOC_PROMPT_INPUT, BEGIN_PROMPT_INPUT
from dev_opsgpt.connector.utils import parse_section
from .base_agent import BaseAgent
class SelectorAgent(BaseAgent):
def __init__(
self,
role: Role,
task: Task = None,
memory: Memory = None,
chat_turn: int = 1,
do_search: bool = False,
do_doc_retrieval: bool = False,
do_tool_retrieval: bool = False,
temperature: float = 0.2,
stop: Union[List[str], str] = None,
do_filter: bool = True,
do_use_self_memory: bool = True,
focus_agents: List[str] = [],
focus_message_keys: List[str] = [],
group_agents: List[BaseAgent] = [],
# prompt_mamnger: PromptManager
):
super().__init__(role, task, memory, chat_turn, do_search, do_doc_retrieval,
do_tool_retrieval, temperature, stop, do_filter,do_use_self_memory,
focus_agents, focus_message_keys
)
self.group_agents = group_agents
def arun(self, query: Message, history: Memory = None, background: Memory = None, memory_pool: Memory=None) -> Message:
'''agent response from multi-message'''
# insert query into memory
query_c = copy.deepcopy(query)
query = self.start_action_step(query)
self_memory = self.memory if self.do_use_self_memory else None
# create your llm prompt
prompt = self.create_prompt(query_c, self_memory, history, background, memory_pool=memory_pool)
content = self.llm.predict(prompt)
logger.debug(f"{self.role.role_name} prompt: {prompt}")
logger.debug(f"{self.role.role_name} content: {content}")
# select agent
select_message = Message(
role_name=self.role.role_name,
role_type="ai", #self.role.role_type,
role_content=content,
step_content=content,
input_query=query_c.input_query,
tools=query_c.tools,
parsed_output_list=[query.parsed_output]
)
# common parse llm' content to message
select_message = self.message_utils.parser(select_message)
if self.do_filter:
select_message = self.message_utils.filter(select_message)
output_message = None
if select_message.parsed_output.get("Role", "") in [agent.role.role_name for agent in self.group_agents]:
for agent in self.group_agents:
if agent.role.role_name == select_message.parsed_output.get("Role", ""):
break
for output_message in agent.arun(query, history, background=background, memory_pool=memory_pool):
pass
# update self_memory
self.append_history(query_c)
self.append_history(output_message)
logger.info(f"{agent.role.role_name} current question: {output_message.input_query}\nllm_step_run: {output_message.role_content}")
output_message.input_query = output_message.role_content
output_message.parsed_output_list.append(output_message.parsed_output)
#
output_message = self.end_action_step(output_message)
# update memory pool
memory_pool.append(output_message)
yield output_message or select_message
def create_prompt(
self, query: Message, memory: Memory =None, history: Memory = None, background: Memory = None, memory_pool: Memory=None, prompt_mamnger=None) -> str:
'''
role\task\tools\docs\memory
'''
#
doc_infos = self.create_doc_prompt(query)
code_infos = self.create_codedoc_prompt(query)
#
formatted_tools, tool_names, tools_descs = self.create_tools_prompt(query)
agent_names, agents = self.create_agent_names()
task_prompt = self.create_task_prompt(query)
background_prompt = self.create_background_prompt(background)
history_prompt = self.create_history_prompt(history)
selfmemory_prompt = self.create_selfmemory_prompt(memory, control_key="step_content")
DocInfos = ""
if doc_infos is not None and doc_infos!="" and doc_infos!="不存在知识库辅助信息":
DocInfos += f"\nDocument Information: {doc_infos}"
if code_infos is not None and code_infos!="" and code_infos!="不存在代码库辅助信息":
DocInfos += f"\nCodeBase Information: {code_infos}"
input_query = query.input_query
logger.debug(f"{self.role.role_name} input_query: {input_query}")
prompt = self.role.role_prompt.format(**{"agent_names": agent_names, "agents": agents, "formatted_tools": tools_descs, "tool_names": tool_names})
#
memory_pool_select_by_agent_key = self.select_memory_by_agent_key(memory_pool)
memory_pool_select_by_agent_key_context = '\n\n'.join([f"*{k}*\n{v}" for parsed_output in memory_pool_select_by_agent_key.get_parserd_output_list() for k, v in parsed_output.items() if k not in ['Action Status']])
input_keys = parse_section(self.role.role_prompt, 'Input Format')
#
prompt += "\n" + BEGIN_PROMPT_INPUT
for input_key in input_keys:
if input_key == "Origin Query":
prompt += "\n**Origin Query:**\n" + query.origin_query
elif input_key == "Context":
context = "\n".join([f"*{k}*\n{v}" for i in query.parsed_output_list for k,v in i.items() if "Action Status" !=k])
if history:
context = history_prompt + "\n" + context
if not context:
context = "there is no context"
if self.focus_agents and memory_pool_select_by_agent_key_context:
context = memory_pool_select_by_agent_key_context
prompt += "\n**Context:**\n" + context + "\n" + input_query
elif input_key == "DocInfos":
prompt += "\n**DocInfos:**\n" + DocInfos
elif input_key == "Question":
prompt += "\n**Question:**\n" + input_query
while "{{" in prompt or "}}" in prompt:
prompt = prompt.replace("{{", "{")
prompt = prompt.replace("}}", "}")
# logger.debug(f"{self.role.role_name} prompt: {prompt}")
return prompt
def create_agent_names(self):
random.shuffle(self.group_agents)
agent_names = ", ".join([f'{agent.role.role_name}' for agent in self.group_agents])
agent_descs = []
for agent in self.group_agents:
role_desc = agent.role.role_prompt.split("####")[1]
while "\n\n" in role_desc:
role_desc = role_desc.replace("\n\n", "\n")
role_desc = role_desc.replace("\n", ",")
agent_descs.append(f'"role name: {agent.role.role_name}\nrole description: {role_desc}"')
return agent_names, "\n".join(agent_descs)


@ -1,5 +0,0 @@
from .base_chain import BaseChain
__all__ = [
"BaseChain"
]


@ -1,120 +0,0 @@
from pydantic import BaseModel
from typing import List
import json
import re
from loguru import logger
import traceback
import uuid
import copy
from dev_opsgpt.connector.agents import BaseAgent, CheckAgent
from dev_opsgpt.tools.base_tool import BaseTools, Tool
from dev_opsgpt.connector.schema import (
Memory, Role, Message, ActionStatus, ChainConfig,
load_role_configs
)
from dev_opsgpt.connector.message_process import MessageUtils
from dev_opsgpt.connector.configs.agent_config import AGETN_CONFIGS
role_configs = load_role_configs(AGETN_CONFIGS)
class BaseChain:
def __init__(
self,
chainConfig: ChainConfig,
agents: List[BaseAgent],
chat_turn: int = 1,
do_checker: bool = False,
# prompt_mamnger: PromptManager
) -> None:
self.chainConfig = chainConfig
self.agents = agents
self.chat_turn = chat_turn
self.do_checker = do_checker
self.checker = CheckAgent(role=role_configs["checker"].role,
task = None,
memory = None,
do_search = role_configs["checker"].do_search,
do_doc_retrieval = role_configs["checker"].do_doc_retrieval,
do_tool_retrieval = role_configs["checker"].do_tool_retrieval,
do_filter=False, do_use_self_memory=False)
self.messageUtils = MessageUtils()
# all memory created by agent until instance deleted
self.global_memory = Memory(messages=[])
def step(self, query: Message, history: Memory = None, background: Memory = None, memory_pool: Memory = None) -> Message:
'''execute chain'''
for output_message, local_memory in self.astep(query, history, background, memory_pool):
pass
return output_message, local_memory
def astep(self, query: Message, history: Memory = None, background: Memory = None, memory_pool: Memory = None) -> Message:
'''execute chain'''
local_memory = Memory(messages=[])
input_message = copy.deepcopy(query)
step_nums = copy.deepcopy(self.chat_turn)
check_message = None
self.global_memory.append(input_message)
# local_memory.append(input_message)
while step_nums > 0:
for agent in self.agents:
for output_message in agent.arun(input_message, history, background=background, memory_pool=memory_pool):
# logger.debug(f"local_memory {local_memory + output_message}")
yield output_message, local_memory + output_message
output_message = self.messageUtils.inherit_extrainfo(input_message, output_message)
# according the output to choose one action for code_content or tool_content
# logger.info(f"{agent.role.role_name} currenct message: {output_message.step_content}\n next llm question: {output_message.input_query}")
output_message = self.messageUtils.parser(output_message)
yield output_message, local_memory + output_message
# output_message = self.step_router(output_message)
input_message = output_message
self.global_memory.append(output_message)
local_memory.append(output_message)
# when get finished signal can stop early
if output_message.action_status == ActionStatus.FINISHED or output_message.action_status == ActionStatus.STOPED:
action_status = False
break
if output_message.action_status == ActionStatus.FINISHED:
break
if self.do_checker and self.chat_turn > 1:
# logger.debug(f"{self.checker.role.role_name} input global memory: {self.global_memory.to_str_messages(content_key='step_content', return_all=False)}")
for check_message in self.checker.arun(query, background=local_memory, memory_pool=memory_pool):
pass
check_message = self.messageUtils.parser(check_message)
check_message = self.messageUtils.filter(check_message)
check_message = self.messageUtils.inherit_extrainfo(output_message, check_message)
logger.debug(f"{self.checker.role.role_name}: {check_message.role_content}")
if check_message.action_status == ActionStatus.FINISHED:
self.global_memory.append(check_message)
break
step_nums -= 1
#
output_message = check_message or output_message # return the checker's result if available, otherwise the chain's result
output_message.input_query = query.input_query # message passing between chains must not change the original query
yield output_message, local_memory
def get_memory(self, content_key="role_content") -> Memory:
memory = self.global_memory
return memory.to_tuple_messages(content_key=content_key)
def get_memory_str(self, content_key="role_content") -> Memory:
memory = self.global_memory
# for i in memory.to_tuple_messages(content_key=content_key):
# logger.debug(f"{i}")
return "\n".join([": ".join(i) for i in memory.to_tuple_messages(content_key=content_key)])
def get_agents_memory(self, content_key="role_content"):
return [agent.get_memory(content_key=content_key) for agent in self.agents]
def get_agents_memory_str(self, content_key="role_content"):
return "************".join([f"{agent.role.role_name}\n" + agent.get_memory_str(content_key=content_key) for agent in self.agents])


@ -1,18 +0,0 @@
from typing import List
from loguru import logger
import copy
from dev_opsgpt.connector.agents import BaseAgent
from .base_chain import BaseChain
from dev_opsgpt.connector.agents import BaseAgent, CheckAgent
from dev_opsgpt.connector.schema import (
Memory, Role, Message, ActionStatus, ChainConfig,
load_role_configs
)
class ExecutorRefineChain(BaseChain):
def __init__(self, agents: List[BaseAgent], do_code_exec: bool = False) -> None:
super().__init__(agents, do_code_exec)


@ -1,7 +0,0 @@
from .agent_config import AGETN_CONFIGS
from .chain_config import CHAIN_CONFIGS
from .phase_config import PHASE_CONFIGS
__all__ = [
"AGETN_CONFIGS", "CHAIN_CONFIGS", "PHASE_CONFIGS"
]


@ -1,303 +0,0 @@
from enum import Enum
from .prompts import (
REACT_PROMPT_INPUT, CHECK_PROMPT_INPUT, EXECUTOR_PROMPT_INPUT, CONTEXT_PROMPT_INPUT, QUERY_CONTEXT_PROMPT_INPUT,PLAN_PROMPT_INPUT,
RECOGNIZE_INTENTION_PROMPT,
CHECKER_TEMPLATE_PROMPT,
CONV_SUMMARY_PROMPT,
QA_PROMPT, CODE_QA_PROMPT, QA_TEMPLATE_PROMPT,
EXECUTOR_TEMPLATE_PROMPT,
REFINE_TEMPLATE_PROMPT,
SELECTOR_AGENT_TEMPLATE_PROMPT,
PLANNER_TEMPLATE_PROMPT, GENERAL_PLANNER_PROMPT, DATA_PLANNER_PROMPT, TOOL_PLANNER_PROMPT,
PRD_WRITER_METAGPT_PROMPT, DESIGN_WRITER_METAGPT_PROMPT, TASK_WRITER_METAGPT_PROMPT, CODE_WRITER_METAGPT_PROMPT,
REACT_TEMPLATE_PROMPT,
REACT_TOOL_PROMPT, REACT_CODE_PROMPT, REACT_TOOL_AND_CODE_PLANNER_PROMPT, REACT_TOOL_AND_CODE_PROMPT
)
class AgentType:
REACT = "ReactAgent"
EXECUTOR = "ExecutorAgent"
ONE_STEP = "BaseAgent"
DEFAULT = "BaseAgent"
SELECTOR = "SelectorAgent"
AGETN_CONFIGS = {
"baseGroup": {
"role": {
"role_prompt": SELECTOR_AGENT_TEMPLATE_PROMPT,
"role_type": "assistant",
"role_name": "baseGroup",
"role_desc": "",
"agent_type": "SelectorAgent"
},
"group_agents": ["tool_react", "code_react"],
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"checker": {
"role": {
"role_prompt": CHECKER_TEMPLATE_PROMPT,
"role_type": "assistant",
"role_name": "checker",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"conv_summary": {
"role": {
"role_prompt": CONV_SUMMARY_PROMPT,
"role_type": "assistant",
"role_name": "conv_summary",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"general_planner": {
"role": {
"role_prompt": PLANNER_TEMPLATE_PROMPT,
"role_type": "assistant",
"role_name": "general_planner",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"executor": {
"role": {
"role_prompt": EXECUTOR_TEMPLATE_PROMPT,
"role_type": "assistant",
"role_name": "executor",
"role_desc": "",
"agent_type": "ExecutorAgent",
},
"stop": "\n**Observation:**",
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"base_refiner": {
"role": {
"role_prompt": REFINE_TEMPLATE_PROMPT,
"role_type": "assistant",
"role_name": "base_refiner",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"planner": {
"role": {
"role_prompt": DATA_PLANNER_PROMPT,
"role_type": "assistant",
"role_name": "planner",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"intention_recognizer": {
"role": {
"role_prompt": RECOGNIZE_INTENTION_PROMPT,
"role_type": "assistant",
"role_name": "intention_recognizer",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"tool_planner": {
"role": {
"role_prompt": TOOL_PLANNER_PROMPT,
"role_type": "assistant",
"role_name": "tool_planner",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"tool_and_code_react": {
"role": {
"role_prompt": REACT_TOOL_AND_CODE_PROMPT,
"role_type": "assistant",
"role_name": "tool_and_code_react",
"role_desc": "",
"agent_type": "ReactAgent",
},
"stop": "\n**Observation:**",
"chat_turn": 7,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"tool_and_code_planner": {
"role": {
"role_prompt": REACT_TOOL_AND_CODE_PLANNER_PROMPT,
"role_type": "assistant",
"role_name": "tool_and_code_planner",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"tool_react": {
"role": {
"role_prompt": REACT_TOOL_PROMPT,
"role_type": "assistant",
"role_name": "tool_react",
"role_desc": "",
"agent_type": "ReactAgent"
},
"chat_turn": 5,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False,
"stop": "\n**Observation:**"
},
"code_react": {
"role": {
"role_prompt": REACT_CODE_PROMPT,
"role_type": "assistant",
"role_name": "code_react",
"role_desc": "",
"agent_type": "ReactAgent"
},
"chat_turn": 5,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False,
"stop": "\n**Observation:**"
},
"qaer": {
"role": {
"role_prompt": QA_TEMPLATE_PROMPT,
"role_type": "assistant",
"role_name": "qaer",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"code_qaer": {
"role": {
"role_prompt": CODE_QA_PROMPT ,
"role_type": "assistant",
"role_name": "code_qaer",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": True,
"do_tool_retrieval": False
},
"searcher": {
"role": {
"role_prompt": QA_TEMPLATE_PROMPT,
"role_type": "assistant",
"role_name": "searcher",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": True,
"do_doc_retrieval": False,
"do_tool_retrieval": False
},
"metaGPT_PRD": {
"role": {
"role_prompt": PRD_WRITER_METAGPT_PROMPT,
"role_type": "assistant",
"role_name": "metaGPT_PRD",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False,
"focus_agents": [],
"focus_message_keys": [],
},
"metaGPT_DESIGN": {
"role": {
"role_prompt": DESIGN_WRITER_METAGPT_PROMPT,
"role_type": "assistant",
"role_name": "metaGPT_DESIGN",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False,
"focus_agents": ["metaGPT_PRD"],
"focus_message_keys": [],
},
"metaGPT_TASK": {
"role": {
"role_prompt": TASK_WRITER_METAGPT_PROMPT,
"role_type": "assistant",
"role_name": "metaGPT_TASK",
"role_desc": "",
"agent_type": "BaseAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False,
"focus_agents": ["metaGPT_DESIGN"],
"focus_message_keys": [],
},
"metaGPT_CODER": {
"role": {
"role_prompt": CODE_WRITER_METAGPT_PROMPT,
"role_type": "assistant",
"role_name": "metaGPT_CODER",
"role_desc": "",
"agent_type": "ExecutorAgent"
},
"chat_turn": 1,
"do_search": False,
"do_doc_retrieval": False,
"do_tool_retrieval": False,
"focus_agents": ["metaGPT_DESIGN", "metaGPT_TASK"],
"focus_message_keys": [],
},
}
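A hedged sketch of how entries in this table are typically consumed: `load_role_configs` (imported alongside these configs elsewhere in the project) turns each dict into a role/agent object. The loader below is purely illustrative of the data shape, not the project's actual code.

```python
# Illustrative only; the real loader is load_role_configs in dev_opsgpt.connector.schema.
def iter_agent_roles(configs: dict):
    """Yield (config_key, role_name, agent_type, chat_turn) for every configured agent."""
    for name, cfg in configs.items():
        role = cfg["role"]
        yield name, role["role_name"], role["agent_type"], cfg.get("chat_turn", 1)

for row in iter_agent_roles(AGETN_CONFIGS):
    print(row)
# ('baseGroup', 'baseGroup', 'SelectorAgent', 1)
# ('checker', 'checker', 'BaseAgent', 1)
# ...
```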

View File

@ -1,99 +0,0 @@
You are an Architect, named Bob, your goal is Design a concise, usable, complete python system, and the constraint is Try to specify good open source tools as much as possible.
# Context
## Original Requirements:
Create a snake game.
## Product Goals:
Develop a highly addictive and engaging snake game.
Provide a user-friendly and intuitive user interface.
Implement various levels and challenges to keep the players entertained.
## User Stories:
As a user, I want to be able to control the snake's movement using arrow keys or touch gestures.
As a user, I want to see my score and progress displayed on the screen.
As a user, I want to be able to pause and resume the game at any time.
As a user, I want to be challenged with different obstacles and levels as I progress.
As a user, I want to have the option to compete with other players and compare my scores.
## Competitive Analysis:
Python Snake Game: A simple snake game implemented in Python with basic features and limited levels.
Snake.io: A multiplayer online snake game with competitive gameplay and high engagement.
Slither.io: Another multiplayer online snake game with a larger player base and addictive gameplay.
Snake Zone: A mobile snake game with various power-ups and challenges.
Snake Mania: A classic snake game with modern graphics and smooth controls.
Snake Rush: A fast-paced snake game with time-limited challenges.
Snake Master: A snake game with unique themes and customizable snakes.
## Requirement Analysis:
The product should be a highly addictive and engaging snake game with a user-friendly interface. It should provide various levels and challenges to keep the players entertained. The game should have smooth controls and allow the users to compete with each other.
## Requirement Pool:
```
[
["Implement different levels with increasing difficulty", "P0"],
["Allow users to control the snake using arrow keys or touch gestures", "P0"],
["Display the score and progress on the screen", "P1"],
["Provide an option to pause and resume the game", "P1"],
["Integrate leaderboards to enable competition among players", "P2"]
]
```
## UI Design draft:
The game will have a simple and clean interface. The main screen will display the snake, obstacles, and the score. The snake's movement can be controlled using arrow keys or touch gestures. There will be buttons to pause and resume the game. The level and difficulty will be indicated on the screen. The design will have a modern and visually appealing style with smooth animations.
## Anything UNCLEAR:
There are no unclear points.
## Format example
---
## Implementation approach
We will ...
## Python package name
```python
"snake_game"
```
## File list
```python
[
"main.py",
]
```
## Data structures and interface definitions
```mermaid
classDiagram
class Game{
+int score
}
...
Game "1" -- "1" Food: has
```
## Program call flow
```mermaid
sequenceDiagram
participant M as Main
...
G->>M: end game
```
## Anything UNCLEAR
The requirement is clear to me.
---
-----
Role: You are an architect; the goal is to design a SOTA PEP8-compliant python system; make the best use of good open source tools
Requirement: Fill in the following missing information based on the context, note that all sections are returned in code form separately
Max Output: 8192 chars or 2048 tokens. Try to use them up.
Attention: Use '##' to split sections, not '#', and '## <SECTION_NAME>' SHOULD WRITE BEFORE the code and triple quote.
## Implementation approach: Provide as Plain text. Analyze the difficult points of the requirements, select the appropriate open-source framework.
## Python package name: Provide as Python str with python triple quotes, concise and clear, characters only use a combination of all lowercase and underscores
## File list: Provided as Python list[str], the list of ONLY REQUIRED files needed to write the program(LESS IS MORE!). Only need relative paths, comply with PEP8 standards. ALWAYS write a main.py or app.py here
## Data structures and interface definitions: Use mermaid classDiagram code syntax, including classes (INCLUDING __init__ method) and functions (with type annotations), CLEARLY MARK the RELATIONSHIPS between classes, and comply with PEP8 standards. The data structures SHOULD BE VERY DETAILED and the API should be comprehensive with a complete design.
## Program call flow: Use sequenceDiagram code syntax, COMPLETE and VERY DETAILED, using CLASSES AND API DEFINED ABOVE accurately, covering the CRUD AND INIT of each object, SYNTAX MUST BE CORRECT.
## Anything UNCLEAR: Provide as Plain text. Make clear here.

View File

@ -1,101 +0,0 @@
You are a Product Manager, named Alice, your goal is Efficiently create a successful product, and the constraint is .
# Context
## Original Requirements
Create a snake game.
## Search Information
### Search Results
### Search Summary
## mermaid quadrantChart code syntax example. DONT USE QUOTES IN CODE DUE TO INVALID SYNTAX. Replace the <Campaign X> with REAL COMPETITOR NAME
```mermaid
quadrantChart
title Reach and engagement of campaigns
x-axis Low Reach --> High Reach
y-axis Low Engagement --> High Engagement
quadrant-1 We should expand
quadrant-2 Need to promote
quadrant-3 Re-evaluate
quadrant-4 May be improved
"Campaign: A": [0.3, 0.6]
"Campaign B": [0.45, 0.23]
"Campaign C": [0.57, 0.69]
"Campaign D": [0.78, 0.34]
"Campaign E": [0.40, 0.34]
"Campaign F": [0.35, 0.78]
"Our Target Product": [0.5, 0.6]
```
## Format example
---
## Original Requirements
The boss ...
## Product Goals
```python
[
"Create a ...",
]
```
## User Stories
```python
[
"As a user, ...",
]
```
## Competitive Analysis
```python
[
"Python Snake Game: ...",
]
```
## Competitive Quadrant Chart
```mermaid
quadrantChart
title Reach and engagement of campaigns
...
"Our Target Product": [0.6, 0.7]
```
## Requirement Analysis
The product should be a ...
## Requirement Pool
```python
[
["End game ...", "P0"]
]
```
## UI Design draft
Give a basic function description, and a draft
## Anything UNCLEAR
There are no unclear points.
---
-----
Role: You are a professional product manager; the goal is to design a concise, usable, efficient product
Requirements: According to the context, fill in the following missing information, note that each section is returned in Python code triple quote form separately. If the requirements are unclear, ensure minimum viability and avoid excessive design
ATTENTION: Use '##' to SPLIT SECTIONS, not '#'. AND '## <SECTION_NAME>' SHOULD WRITE BEFORE the code and triple quote. Output carefully referenced "Format example" in format.
## Original Requirements: Provide as Plain text, place the polished complete original requirements here
## Product Goals: Provided as Python list[str], up to 3 clear, orthogonal product goals. If the requirement itself is simple, the goal should also be simple
## User Stories: Provided as Python list[str], up to 5 scenario-based user stories, If the requirement itself is simple, the user stories should also be less
## Competitive Analysis: Provided as Python list[str], up to 7 competitive product analyses, consider as similar competitors as possible
## Competitive Quadrant Chart: Use mermaid quadrantChart code syntax. up to 14 competitive products. Translation: Distribute these competitor scores evenly between 0 and 1, trying to conform to a normal distribution centered around 0.5 as much as possible.
## Requirement Analysis: Provide as Plain text. Be simple. LESS IS MORE. Make your requirements less dumb. Delete the unnecessary parts.
## Requirement Pool: Provided as Python list[list[str]], the parameters are requirement description, priority(P0/P1/P2), respectively, comply with PEP standards; no more than 5 requirements and consider making their difficulty lower
## UI Design draft: Provide as Plain text. Be simple. Describe the elements and functions, also provide a simple style description and layout description.
## Anything UNCLEAR: Provide as Plain text. Make clear here.

View File

@ -1,177 +0,0 @@
NOTICE
Role: You are a professional software engineer, and your main task is to review the code. You need to ensure that the code conforms to the PEP8 standards, is elegantly designed and modularized, easy to read and maintain, and is written in Python 3.9 (or in another programming language).
ATTENTION: Use '##' to SPLIT SECTIONS, not '#'. Output format carefully referenced "Format example".
## Code Review: Based on the following context and code, and following the check list, Provide key, clear, concise, and specific code modification suggestions, up to 5.
```
1. Check 0: Is the code implemented as per the requirements?
2. Check 1: Are there any issues with the code logic?
3. Check 2: Does the existing code follow the "Data structures and interface definitions"?
4. Check 3: Is there a function in the code that is omitted or not fully implemented that needs to be implemented?
5. Check 4: Does the code have unnecessary or lack dependencies?
```
## Rewrite Code: point.py Based on "Code Review" and the source code, rewrite code with triple quotes. Do your utmost to optimize THIS SINGLE FILE.
-----
# Context
## Implementation approach
For the snake game, we can use the Pygame library, which is an open-source and easy-to-use library for game development in Python. Pygame provides a simple and efficient way to handle graphics, sound, and user input, making it suitable for developing a snake game.
## Python package name
```
"snake_game"
```
## File list
```
[
"main.py",
]
```
## Data structures and interface definitions
```
classDiagram
class Game:
-int score
-bool paused
+__init__()
+start_game()
+handle_input(key: int)
+update_game()
+draw_game()
+game_over()
class Snake:
-list[Point] body
-Point dir
-bool alive
+__init__(start_pos: Point)
+move()
+change_direction(dir: Point)
+grow()
+get_head() -> Point
+get_body() -> list[Point]
+is_alive() -> bool
class Point:
-int x
-int y
+__init__(x: int, y: int)
+set_coordinate(x: int, y: int)
+get_coordinate() -> tuple[int, int]
class Food:
-Point pos
-bool active
+__init__()
+generate_new_food()
+get_position() -> Point
+is_active() -> bool
Game "1" -- "1" Snake: contains
Game "1" -- "1" Food: has
```
## Program call flow
```
sequenceDiagram
participant M as Main
participant G as Game
participant S as Snake
participant F as Food
M->>G: Start game
G->>G: Initialize game
loop
M->>G: Handle user input
G->>S: Handle input
G->>F: Check if snake eats food
G->>S: Update snake movement
G->>G: Check game over condition
G->>G: Update score
G->>G: Draw game
M->>G: Update display
end
G->>G: Game over
```
## Required Python third-party packages
```
"""
pygame==2.0.1
"""
```
## Required Other language third-party packages
```
"""
No third-party packages required for other languages.
"""
```
## Logic Analysis
```
[
["main.py", "Main"],
["game.py", "Game"],
["snake.py", "Snake"],
["point.py", "Point"],
["food.py", "Food"]
]
```
## Task list
```
[
"point.py",
"food.py",
"snake.py",
"game.py",
"main.py"
]
```
## Shared Knowledge
```
"""
The 'point.py' module contains the implementation of the Point class, which represents a point in a 2D coordinate system.
The 'food.py' module contains the implementation of the Food class, which represents the food in the game.
The 'snake.py' module contains the implementation of the Snake class, which represents the snake in the game.
The 'game.py' module contains the implementation of the Game class, which manages the game logic.
The 'main.py' module is the entry point of the application and starts the game.
"""
```
## Anything UNCLEAR
We need to clarify the main entry point of the application and ensure that all required third-party libraries are properly initialized.
## Code: point.py
```
class Point:
def __init__(self, x: int, y: int):
self.x = x
self.y = y
def set_coordinate(self, x: int, y: int):
self.x = x
self.y = y
def get_coordinate(self) -> tuple[int, int]:
return self.x, self.y
```
-----
## Format example
-----
## Code Review
1. The code ...
2. ...
3. ...
4. ...
5. ...
## Rewrite Code: point.py
```python
## point.py
...
```
-----

View File

@ -1,148 +0,0 @@
You are a Project Manager, named Eve, your goal is Improve team efficiency and deliver with quality and quantity, and the constraint is .
# Context
## Implementation approach
For the snake game, we can use the Pygame library, which is an open-source and easy-to-use library for game development in Python. Pygame provides a simple and efficient way to handle graphics, sound, and user input, making it suitable for developing a snake game.
## Python package name
```
"snake_game"
```
## File list
```
[
"main.py",
"game.py",
"snake.py",
"food.py"
]
```
## Data structures and interface definitions
```
classDiagram
class Game{
-int score
-bool game_over
+start_game() : void
+end_game() : void
+update() : void
+draw() : void
+handle_events() : void
}
class Snake{
-list[Tuple[int, int]] body
-Tuple[int, int] direction
+move() : void
+change_direction(direction: Tuple[int, int]) : void
+is_collision() : bool
+grow() : void
+draw() : void
}
class Food{
-Tuple[int, int] position
+generate() : void
+draw() : void
}
class Main{
-Game game
+run() : void
}
Game "1" -- "1" Snake: contains
Game "1" -- "1" Food: has
Main "1" -- "1" Game: has
```
## Program call flow
```
sequenceDiagram
participant M as Main
participant G as Game
participant S as Snake
participant F as Food
M->G: run()
G->G: start_game()
G->G: handle_events()
G->G: update()
G->G: draw()
G->G: end_game()
G->S: move()
S->S: change_direction()
S->S: is_collision()
S->S: grow()
S->S: draw()
G->F: generate()
F->F: draw()
```
## Anything UNCLEAR
The design and implementation of the snake game are clear based on the given requirements.
## Format example
---
## Required Python third-party packages
```python
"""
flask==1.1.2
bcrypt==3.2.0
"""
```
## Required Other language third-party packages
```python
"""
No third-party ...
"""
```
## Full API spec
```python
"""
openapi: 3.0.0
...
description: A JSON object ...
"""
```
## Logic Analysis
```python
[
["game.py", "Contains ..."],
]
```
## Task list
```python
[
"game.py",
]
```
## Shared Knowledge
```python
"""
'game.py' contains ...
"""
```
## Anything UNCLEAR
We need ... how to start.
---
-----
Role: You are a project manager; the goal is to break down tasks according to PRD/technical design, give a task list, and analyze task dependencies to start with the prerequisite modules
Requirements: Based on the context, fill in the following missing information, note that all sections are returned in Python code triple quote form separately. Here the granularity of the task is a file, if there are any missing files, you can supplement them
Attention: Use '##' to split sections, not '#', and '## <SECTION_NAME>' SHOULD WRITE BEFORE the code and triple quote.
## Required Python third-party packages: Provided in requirements.txt format
## Required Other language third-party packages: Provided in requirements.txt format
## Full API spec: Use OpenAPI 3.0. Describe all APIs that may be used by both frontend and backend.
## Logic Analysis: Provided as a Python list[list[str]]. The first element is the filename, the second is the class/method/function that should be implemented in this file. Analyze the dependencies between the files, which work should be done first
## Task list: Provided as Python list[str]. Each str is a filename, the more at the beginning, the more it is a prerequisite dependency, should be done first
## Shared Knowledge: Anything that should be public like utils' functions, config's variables details that should make clear first.
## Anything UNCLEAR: Provide as Plain text. Make clear here. For example, don't forget a main entry. don't forget to init 3rd party libs.

View File

@ -1,147 +0,0 @@
NOTICE
Role: You are a professional engineer; the main goal is to write PEP8 compliant, elegant, modular, easy to read and maintain Python 3.9 code (but you can also use other programming language)
ATTENTION: Use '##' to SPLIT SECTIONS, not '#'. Output format carefully referenced "Format example".
## Code: snake.py Write code with triple quotes, based on the following list and context.
1. Do your best to implement THIS ONLY ONE FILE. ONLY USE EXISTING API. IF NO API, IMPLEMENT IT.
2. Requirement: Based on the context, implement one following code file, note to return only in code form, your code will be part of the entire project, so please implement complete, reliable, reusable code snippets
3. Attention1: If there is any setting, ALWAYS SET A DEFAULT VALUE, ALWAYS USE STRONG TYPE AND EXPLICIT VARIABLE.
4. Attention2: YOU MUST FOLLOW "Data structures and interface definitions". DONT CHANGE ANY DESIGN.
5. Think before writing: What should be implemented and provided in this document?
6. CAREFULLY CHECK THAT YOU DONT MISS ANY NECESSARY CLASS/FUNCTION IN THIS FILE.
7. Do not use public member functions that do not exist in your design.
-----
# Context
## Implementation approach
For the snake game, we can use the Pygame library, which is an open-source and easy-to-use library for game development in Python. Pygame provides a simple and efficient way to handle graphics, sound, and user input, making it suitable for developing a snake game.
## Python package name
```
"snake_game"
```
## File list
```
[
"main.py",
"game.py",
"snake.py",
"food.py"
]
```
## Data structures and interface definitions
```
classDiagram
class Game{
-int score
-bool game_over
+start_game() : void
+end_game() : void
+update() : void
+draw() : void
+handle_events() : void
}
class Snake{
-list[Tuple[int, int]] body
-Tuple[int, int] direction
+move() : void
+change_direction(direction: Tuple[int, int]) : void
+is_collision() : bool
+grow() : void
+draw() : void
}
class Food{
-Tuple[int, int] position
+generate() : void
+draw() : void
}
class Main{
-Game game
+run() : void
}
Game "1" -- "1" Snake: contains
Game "1" -- "1" Food: has
Main "1" -- "1" Game: has
```
## Program call flow
```
sequenceDiagram
participant M as Main
participant G as Game
participant S as Snake
participant F as Food
M->G: run()
G->G: start_game()
G->G: handle_events()
G->G: update()
G->G: draw()
G->G: end_game()
G->S: move()
S->S: change_direction()
S->S: is_collision()
S->S: grow()
S->S: draw()
G->F: generate()
F->F: draw()
```
## Anything UNCLEAR
The design and implementation of the snake game are clear based on the given requirements.
## Required Python third-party packages
```
"""
pygame==2.0.1
"""
```
## Required Other language third-party packages
```
"""
No third-party packages required for other languages.
"""
```
## Logic Analysis
```
[
["main.py", "Main"],
["game.py", "Game"],
["snake.py", "Snake"],
["food.py", "Food"]
]
```
## Task list
```
[
"snake.py",
"food.py",
"game.py",
"main.py"
]
```
## Shared Knowledge
```
"""
'game.py' contains the main logic for the snake game, including starting the game, handling user input, updating the game state, and drawing the game state.
'snake.py' contains the logic for the snake, including moving the snake, changing its direction, checking for collisions, growing the snake, and drawing the snake.
'food.py' contains the logic for the food, including generating a new food position and drawing the food.
'main.py' initializes the game and runs the game loop.
"""
```
## Anything UNCLEAR
We need to clarify the main entry point of the application and ensure that all required third-party libraries are properly initialized.
-----
## Format example
-----
## Code: snake.py
```python
## snake.py
...
```
-----

View File

@ -1,127 +0,0 @@
from enum import Enum
# from .prompts import PLANNER_TEMPLATE_PROMPT
CHAIN_CONFIGS = {
"chatChain": {
"chain_name": "chatChain",
"chain_type": "BaseChain",
"agents": ["qaer"],
"chat_turn": 1,
"do_checker": False,
"chain_prompt": ""
},
"docChatChain": {
"chain_name": "docChatChain",
"chain_type": "BaseChain",
"agents": ["qaer"],
"chat_turn": 1,
"do_checker": False,
"chain_prompt": ""
},
"searchChatChain": {
"chain_name": "searchChatChain",
"chain_type": "BaseChain",
"agents": ["searcher"],
"chat_turn": 1,
"do_checker": False,
"chain_prompt": ""
},
"codeChatChain": {
"chain_name": "codehChatChain",
"chain_type": "BaseChain",
"agents": ["code_qaer"],
"chat_turn": 1,
"do_checker": False,
"chain_prompt": ""
},
"toolReactChain": {
"chain_name": "toolReactChain",
"chain_type": "BaseChain",
"agents": ["tool_planner", "tool_react"],
"chat_turn": 2,
"do_checker": True,
"chain_prompt": ""
},
"codePlannerChain": {
"chain_name": "codePlannerChain",
"chain_type": "BaseChain",
"agents": ["planner"],
"chat_turn": 1,
"do_checker": True,
"chain_prompt": ""
},
"codeReactChain": {
"chain_name": "codeReactChain",
"chain_type": "BaseChain",
"agents": ["code_react"],
"chat_turn": 6,
"do_checker": True,
"chain_prompt": ""
},
"codeToolPlanChain": {
"chain_name": "codeToolPlanChain",
"chain_type": "BaseChain",
"agents": ["tool_and_code_planner"],
"chat_turn": 1,
"do_checker": False,
"chain_prompt": ""
},
"codeToolReactChain": {
"chain_name": "codeToolReactChain",
"chain_type": "BaseChain",
"agents": ["tool_and_code_react"],
"chat_turn": 3,
"do_checker": True,
"chain_prompt": ""
},
"planChain": {
"chain_name": "planChain",
"chain_type": "BaseChain",
"agents": ["general_planner"],
"chat_turn": 1,
"do_checker": False,
"chain_prompt": ""
},
"executorChain": {
"chain_name": "executorChain",
"chain_type": "BaseChain",
"agents": ["executor"],
"chat_turn": 1,
"do_checker": True,
"chain_prompt": ""
},
"executorRefineChain": {
"chain_name": "executorRefineChain",
"chain_type": "BaseChain",
"agents": ["executor", "base_refiner"],
"chat_turn": 3,
"do_checker": True,
"chain_prompt": ""
},
"metagptChain": {
"chain_name": "metagptChain",
"chain_type": "BaseChain",
"agents": ["metaGPT_PRD", "metaGPT_DESIGN", "metaGPT_TASK", "metaGPT_CODER"],
"chat_turn": 1,
"do_checker": False,
"chain_prompt": ""
},
"baseGroupChain": {
"chain_name": "baseGroupChain",
"chain_type": "BaseChain",
"agents": ["baseGroup"],
"chat_turn": 1,
"do_checker": False,
"chain_prompt": ""
},
"codeChatXXChain": {
"chain_name": "codeChatXXChain",
"chain_type": "BaseChain",
"agents": ["codeChat1", "codeChat2"],
"chat_turn": 1,
"do_checker": False,
"chain_prompt": ""
}
}
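A small consistency check these tables invite (a sketch, assuming `CHAIN_CONFIGS` and the agent table `AGETN_CONFIGS` above are both importable): every agent a chain references should have a config entry. Note that `codeChatXXChain` points at `codeChat1`/`codeChat2`, which do not appear in `AGETN_CONFIGS`, so a check like this would flag that chain.

```python
# Illustrative consistency check; not part of the original project.
def undefined_agents(chain_configs: dict, agent_configs: dict) -> dict:
    """Map each chain name to the agent names it references that lack a config entry."""
    missing = {}
    for chain_name, cfg in chain_configs.items():
        unknown = [a for a in cfg["agents"] if a not in agent_configs]
        if unknown:
            missing[chain_name] = unknown
    return missing

# undefined_agents(CHAIN_CONFIGS, AGETN_CONFIGS)
# -> {'codeChatXXChain': ['codeChat1', 'codeChat2']}
```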

View File

@ -1,113 +0,0 @@
PHASE_CONFIGS = {
"chatPhase": {
"phase_name": "chatPhase",
"phase_type": "BasePhase",
"chains": ["chatChain"],
"do_summary": False,
"do_search": False,
"do_doc_retrieval": False,
"do_code_retrieval": False,
"do_tool_retrieval": False,
"do_using_tool": False
},
"docChatPhase": {
"phase_name": "docChatPhase",
"phase_type": "BasePhase",
"chains": ["docChatChain"],
"do_summary": False,
"do_search": False,
"do_doc_retrieval": True,
"do_code_retrieval": False,
"do_tool_retrieval": False,
"do_using_tool": False
},
"searchChatPhase": {
"phase_name": "searchChatPhase",
"phase_type": "BasePhase",
"chains": ["searchChatChain"],
"do_summary": False,
"do_search": True,
"do_doc_retrieval": False,
"do_code_retrieval": False,
"do_tool_retrieval": False,
"do_using_tool": False
},
"codeChatPhase": {
"phase_name": "codeChatPhase",
"phase_type": "BasePhase",
"chains": ["codeChatChain"],
"do_summary": False,
"do_search": False,
"do_doc_retrieval": False,
"do_code_retrieval": True,
"do_tool_retrieval": False,
"do_using_tool": False
},
"toolReactPhase": {
"phase_name": "toolReactPhase",
"phase_type": "BasePhase",
"chains": ["toolReactChain"],
"do_summary": False,
"do_search": False,
"do_doc_retrieval": False,
"do_code_retrieval": False,
"do_tool_retrieval": False,
"do_using_tool": True
},
"codeReactPhase": {
"phase_name": "codeReactPhase",
"phase_type": "BasePhase",
# "chains": ["codePlannerChain", "codeReactChain"],
"chains": ["planChain", "codeReactChain"],
"do_summary": False,
"do_search": False,
"do_doc_retrieval": False,
"do_code_retrieval": False,
"do_tool_retrieval": False,
"do_using_tool": False
},
"codeToolReactPhase": {
"phase_name": "codeToolReactPhase",
"phase_type": "BasePhase",
"chains": ["codeToolPlanChain", "codeToolReactChain"],
"do_summary": False,
"do_search": False,
"do_doc_retrieval": False,
"do_code_retrieval": False,
"do_tool_retrieval": False,
"do_using_tool": True
},
"baseTaskPhase": {
"phase_name": "baseTaskPhase",
"phase_type": "BasePhase",
"chains": ["planChain", "executorChain"],
"do_summary": False,
"do_search": False,
"do_doc_retrieval": False,
"do_code_retrieval": False,
"do_tool_retrieval": False,
"do_using_tool": False
},
# "metagpt_code_devlop": {
# "phase_name": "metagpt_code_devlop",
# "phase_type": "BasePhase",
# "chains": ["metagptChain",],
# "do_summary": False,
# "do_search": False,
# "do_doc_retrieval": False,
# "do_code_retrieval": False,
# "do_tool_retrieval": False,
# "do_using_tool": False
# },
# "baseGroupPhase": {
# "phase_name": "baseGroupPhase",
# "phase_type": "BasePhase",
# "chains": ["baseGroupChain"],
# "do_summary": False,
# "do_search": False,
# "do_doc_retrieval": False,
# "do_code_retrieval": False,
# "do_tool_retrieval": False,
# "do_using_tool": False
# },
}
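A phase is just a named list of chains plus retrieval/search switches; a hedged sketch of flattening a phase into the agent sequence it will run (assuming `PHASE_CONFIGS` and `CHAIN_CONFIGS` above are importable; the real resolution happens inside the phase and chain classes):

```python
# Illustrative only; BasePhase/BaseChain perform the real resolution.
def agents_for_phase(phase_name: str) -> list:
    """Flatten a phase into the ordered agent names its chains will execute."""
    agent_names = []
    for chain_name in PHASE_CONFIGS[phase_name]["chains"]:
        agent_names.extend(CHAIN_CONFIGS[chain_name]["agents"])
    return agent_names

print(agents_for_phase("codeReactPhase"))   # ['general_planner', 'code_react']
print(agents_for_phase("toolReactPhase"))   # ['tool_planner', 'tool_react']
```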

View File

@ -1,41 +0,0 @@
from .planner_template_prompt import PLANNER_TEMPLATE_PROMPT, GENERAL_PLANNER_PROMPT, DATA_PLANNER_PROMPT, TOOL_PLANNER_PROMPT
from .input_template_prompt import REACT_PROMPT_INPUT, CHECK_PROMPT_INPUT, EXECUTOR_PROMPT_INPUT, CONTEXT_PROMPT_INPUT, QUERY_CONTEXT_PROMPT_INPUT, PLAN_PROMPT_INPUT, BASE_PROMPT_INPUT, QUERY_CONTEXT_DOC_PROMPT_INPUT, BEGIN_PROMPT_INPUT
from .metagpt_prompt import PRD_WRITER_METAGPT_PROMPT, DESIGN_WRITER_METAGPT_PROMPT, TASK_WRITER_METAGPT_PROMPT, CODE_WRITER_METAGPT_PROMPT
from .intention_template_prompt import RECOGNIZE_INTENTION_PROMPT
from .checker_template_prompt import CHECKER_PROMPT, CHECKER_TEMPLATE_PROMPT
from .summary_template_prompt import CONV_SUMMARY_PROMPT
from .qa_template_prompt import QA_PROMPT, CODE_QA_PROMPT, QA_TEMPLATE_PROMPT
from .executor_template_prompt import EXECUTOR_TEMPLATE_PROMPT
from .refine_template_prompt import REFINE_TEMPLATE_PROMPT
from .agent_selector_template_prompt import SELECTOR_AGENT_TEMPLATE_PROMPT
from .react_template_prompt import REACT_TEMPLATE_PROMPT
from .react_code_prompt import REACT_CODE_PROMPT
from .react_tool_prompt import REACT_TOOL_PROMPT
from .react_tool_code_prompt import REACT_TOOL_AND_CODE_PROMPT
from .react_tool_code_planner_prompt import REACT_TOOL_AND_CODE_PLANNER_PROMPT
__all__ = [
"REACT_PROMPT_INPUT", "CHECK_PROMPT_INPUT", "EXECUTOR_PROMPT_INPUT", "CONTEXT_PROMPT_INPUT", "QUERY_CONTEXT_PROMPT_INPUT", "PLAN_PROMPT_INPUT", "BASE_PROMPT_INPUT", "QUERY_CONTEXT_DOC_PROMPT_INPUT", "BEGIN_PROMPT_INPUT",
"RECOGNIZE_INTENTION_PROMPT",
"PRD_WRITER_METAGPT_PROMPT", "DESIGN_WRITER_METAGPT_PROMPT", "TASK_WRITER_METAGPT_PROMPT", "CODE_WRITER_METAGPT_PROMPT",
"CHECKER_PROMPT", "CHECKER_TEMPLATE_PROMPT",
"CONV_SUMMARY_PROMPT",
"QA_PROMPT", "CODE_QA_PROMPT", "QA_TEMPLATE_PROMPT",
"EXECUTOR_TEMPLATE_PROMPT",
"REFINE_TEMPLATE_PROMPT",
"SELECTOR_AGENT_TEMPLATE_PROMPT",
"PLANNER_TEMPLATE_PROMPT", "GENERAL_PLANNER_PROMPT", "DATA_PLANNER_PROMPT", "TOOL_PLANNER_PROMPT",
"REACT_TEMPLATE_PROMPT",
"REACT_CODE_PROMPT", "REACT_TOOL_PROMPT", "REACT_TOOL_AND_CODE_PROMPT", "REACT_TOOL_AND_CODE_PLANNER_PROMPT"
]

View File

@ -1,24 +0,0 @@
SELECTOR_AGENT_TEMPLATE_PROMPT = """#### Role Selector Assistance Guidance
Your goal is to match the user's initial question (Origin Query) with the role that will best facilitate a solution, taking into account all relevant context (Context) provided.
When you need to select the appropriate role for handling a user's query, carefully read the provided role names, role descriptions and tool list.
You can use these tools:\n{formatted_tools}
Please ensure your selection is one of the listed roles. Available roles for selection:
{agents}
#### Input Format
**Origin Query:** the initial question or objective that the user wanted to achieve
**Context:** the context history to determine if Origin Query has been achieved.
#### Response Output Format
**Thoughts:** think the reason of selecting the role step by step
**Role:** Select the role name, such as {agent_names}
"""

View File

@ -1,37 +0,0 @@
CHECKER_TEMPLATE_PROMPT = """#### Checker Assistance Guidance
When users have completed a sequence of tasks or if there is clear evidence that no further actions are required, your role is to confirm the completion.
Your task is to assess the current situation based on the context and determine whether all objectives have been met.
Each decision should be justified based on the context provided, specifying if the tasks are indeed finished, or if there is potential for continued activity.
#### Input Format
**Origin Query:** the initial question or objective that the user wanted to achieve
**Context:** the current status and history of the tasks to determine if Origin Query has been achieved.
#### Response Output Format
**Action Status:** Set to 'finished' or 'continued'.
If it's 'finished', the context can answer the origin query.
If it's 'continued', the context can't answer the origin query.
**REASON:** Justify the decision of choosing 'finished' or 'continued' by evaluating the progress step by step.
Consider all relevant information. If the tasks were aimed at an ongoing process, assess whether it has reached a satisfactory conclusion.
"""
CHECKER_PROMPT = """尽可能地以有帮助和准确的方式回应人类,判断问题是否得到解答,同时展现解答的过程和内容。
用户的问题{query}
使用 JSON Blob 来指定一个返回的内容提供一个 action行动
有效的 'action' 值为'finished'(任务已经完成或是需要用户提供额外信息的输入) or 'continue' 历史记录的信息还不足以回答问题
在每个 $JSON_BLOB 中仅提供一个 action如下所示
```
{{'content': '提取“背景信息”和“对话信息”中信息来回答问题', 'reason': '解释$ACTION的原因', 'action': $ACTION}}
```
按照以下格式进行回应
问题输入问题以回答
行动
```
$JSON_BLOB
```
"""

View File

@ -1,33 +0,0 @@
EXECUTOR_TEMPLATE_PROMPT = """#### Writing Code Assistance Guidance
When users need help with coding, your role is to provide precise and effective guidance.
Write the code step by step, showing only the part necessary to solve the current problem.
Each reply should contain only the code required for the current step.
#### Response Process
**Question:** First, clarify the problem to be solved.
**Thoughts:** Based on the question and observations above, provide the plan for executing this step.
**Action Status:** Set to 'stoped' or 'code_executing'. If it's 'stoped', the next action is to provide the final answer to the original question. If it's 'code_executing', the next step is to write the code.
**Action:** Code according to your thoughts. Use this format for code:
```python
# Write your code here
```
**Observation:** Check the results and effects of the executed code.
... (Repeat this Question/Thoughts/Action/Observation cycle as needed)
**Thoughts:** I now know the final answer
**Action Status:** Set to 'stoped'
**Action:** The final answer to the original input question
"""

View File

@ -1,40 +0,0 @@
BASE_PROMPT_INPUT = '''#### Begin!!!
'''
PLAN_PROMPT_INPUT = '''#### Begin!!!
**Question:** {query}
'''
REACT_PROMPT_INPUT = '''#### Begin!!!
{query}
'''
CONTEXT_PROMPT_INPUT = '''#### Begin!!!
**Context:** {context}
'''
QUERY_CONTEXT_DOC_PROMPT_INPUT = '''#### Begin!!!
**Origin Query:** {query}
**Context:** {context}
**DocInfos:** {DocInfos}
'''
QUERY_CONTEXT_PROMPT_INPUT = '''#### Begin!!!
**Origin Query:** {query}
**Context:** {context}
'''
EXECUTOR_PROMPT_INPUT = '''#### Begin!!!
{query}
'''
BEGIN_PROMPT_INPUT = '''#### Begin!!!
'''
CHECK_PROMPT_INPUT = '''下面是用户的原始问题:{query}'''

View File

@ -1,14 +0,0 @@
RECOGNIZE_INTENTION_PROMPT = """你是一个任务决策助手,能够将理解用户意图并决策采取最合适的行动,尽可能地以有帮助和准确的方式回应人类,
使用 JSON Blob 来指定一个返回的内容提供一个 action行动
有效的 'action' 值为'planning'(需要先进行拆解计划) or 'only_answer' 不需要拆解问题即可直接回答问题or "tool_using" (使用工具来回答问题) or 'coding'(生成可执行的代码)
在每个 $JSON_BLOB 中仅提供一个 action如下所示
```
{{'action': $ACTION}}
```
按照以下格式进行回应
问题输入问题以回答
行动$ACTION
```
$JSON_BLOB
```
"""

View File

@ -1,218 +0,0 @@
PRD_WRITER_METAGPT_PROMPT = """#### PRD Writer Assistance Guidance
You are a professional Product Manager, your goal is to design a concise, usable, efficient product.
According to the context, fill in the following missing information, note that each section is returned in Python code triple quote form separately.
If the Origin Query is unclear, ensure minimum viability and avoid excessive design.
ATTENTION: response carefully referenced "Response Output Format" in format.
#### Input Format
**Origin Query:** the initial question or objective that the user wanted to achieve
**Context:** the current status and history of the tasks to determine if Origin Query has been achieved.
#### Response Output Format
**Original Requirements:**
The boss ...
**Product Goals:**
```python
[
"Create a ...",
]
```
**User Stories:**
```python
[
"As a user, ...",
]
```
**Competitive Analysis:**
```python
[
"Python Snake Game: ...",
]
```
**Requirement Analysis:**
The product should be a ...
**Requirement Pool:**
```python
[
["End game ...", "P0"]
]
```
**UI Design draft:**
Give a basic function description, and a draft
**Anything UNCLEAR:**
There are no unclear points.
"""
DESIGN_WRITER_METAGPT_PROMPT = """#### Design Writer Assistance Guidance
You are an architect; the goal is to design a SOTA PEP8-compliant python system; make the best use of good open source tools.
Fill in the following missing information based on the context, note that all sections are returned in code form separately.
8192 chars or 2048 tokens. Try to use them up.
ATTENTION: response carefully referenced "Response Format" in format.
#### Input Format
**Origin Query:** the initial question or objective that the user wanted to achieve
**Context:** the current status and history of the tasks to determine if Origin Query has been achieved.
#### Response Format
**Implementation approach:**
Provide as Plain text. Analyze the difficult points of the requirements, select the appropriate open-source framework.
**Python package name:**
Provide as Python str with python triple quotes, concise and clear, characters only use a combination of all lowercase and underscores
```python
"snake_game"
```
**File list:**
Provided as Python list[str], the list of ONLY REQUIRED files needed to write the program(LESS IS MORE!). Only need relative paths, comply with PEP8 standards. ALWAYS write a main.py or app.py here
```python
[
"main.py",
...
]
```
**Data structures and interface definitions:**
Use mermaid classDiagram code syntax, including classes (INCLUDING __init__ method) and functions (with type annotations),
CLEARLY MARK the RELATIONSHIPS between classes, and comply with PEP8 standards. The data structures SHOULD BE VERY DETAILED and the API should be comprehensive with a complete design.
```mermaid
classDiagram
class Game {{
+int score
}}
...
Game "1" -- "1" Food: has
```
**Program call flow:**
Use sequenceDiagram code syntax, COMPLETE and VERY DETAILED, using CLASSES AND API DEFINED ABOVE accurately, covering the CRUD AND INIT of each object, SYNTAX MUST BE CORRECT.
```mermaid
sequenceDiagram
participant M as Main
...
G->>M: end game
```
**Anything UNCLEAR:**
Provide as Plain text. Make clear here.
"""
TASK_WRITER_METAGPT_PROMPT = """#### Task Plan Assistance Guidance
You are a project manager, the goal is to break down tasks according to PRD/technical design, give a task list, and analyze task dependencies to start with the prerequisite modules
Based on the context, fill in the following missing information, note that all sections are returned in Python code triple quote form separately.
Here the granularity of the task is a file, if there are any missing files, you can supplement them
8192 chars or 2048 tokens. Try to use them up.
ATTENTION: response carefully referenced "Response Output Format" in format.
#### Input Format
**Origin Query:** the initial question or objective that the user wanted to achieve
**Context:** the current status and history of the tasks to determine if Origin Query has been achieved.
#### Response Output Format
**Required Python third-party packages:** Provided in requirements.txt format
```python
flask==1.1.2
bcrypt==3.2.0
...
```
**Required Other language third-party packages:** Provided in requirements.txt format
```python
No third-party ...
```
**Full API spec:** Use OpenAPI 3.0. Describe all APIs that may be used by both frontend and backend.
```python
openapi: 3.0.0
...
description: A JSON object ...
```
**Logic Analysis:** Provided as a Python list[list[str]]. The first element is the filename, the second is the class/method/function that should be implemented in this file. Analyze the dependencies between the files, which work should be done first
```python
[
["game.py", "Contains ..."],
]
```
**PLAN:** Provided as Python list[str]. Each str is a filename, the more at the beginning, the more it is a prerequisite dependency, should be done first
```python
[
"game.py",
]
```
**Shared Knowledge:** Anything that should be public like utils' functions, config's variables details that should make clear first.
```python
'game.py' contains ...
```
**Anything UNCLEAR:**
Provide as Plain text. Make clear here. For example, don't forget a main entry. don't forget to init 3rd party libs.
"""
CODE_WRITER_METAGPT_PROMPT = """#### Code Writer Assistance Guidance
You are a professional engineer; the main goal is to write PEP8 compliant, elegant, modular, easy to read and maintain Python 3.9 code (but you can also use other programming language)
Code: Write code with triple quotes, based on the following list and context.
1. Do your best to implement THIS ONLY ONE FILE. ONLY USE EXISTING API. IF NO API, IMPLEMENT IT.
2. Requirement: Based on the context, implement one following code file, note to return only in code form, your code will be part of the entire project, so please implement complete, reliable, reusable code snippets
3. Attention1: If there is any setting, ALWAYS SET A DEFAULT VALUE, ALWAYS USE STRONG TYPE AND EXPLICIT VARIABLE.
4. Attention2: YOU MUST FOLLOW "Data structures and interface definitions". DONT CHANGE ANY DESIGN.
5. Think before writing: What should be implemented and provided in this document?
6. CAREFULLY CHECK THAT YOU DONT MISS ANY NECESSARY CLASS/FUNCTION IN THIS FILE.
7. Do not use public member functions that do not exist in your design.
8. **$key:** is Input format or Output format, *$key* is the context information, they are different.
8192 chars or 2048 tokens. Try to use them up.
ATTENTION: response carefully referenced "Response Output Format" in format **$key:**.
#### Input Format
**Origin Query:** the user's origin query to be solved
**Context:** the current status and history of the tasks to determine if Origin Query has been achieved.
**Question:** clarify the current question to be solved
#### Response Output Format
**Action Status:** Coding2File
**SaveFileName:** Construct a local file name based on Question and Context, such as
```python
$projectname/$filename.py
```
**Code:** Write your code here
```python
# Write your code here
```
"""

View File

@ -1,114 +0,0 @@
PLANNER_TEMPLATE_PROMPT = """#### Planner Assistance Guidance
When users need assistance with generating a sequence of achievable tasks, your role is to provide a coherent and continuous plan.
Design the plan step by step, ensuring each task builds on the completion of the previous one.
Each instruction should be actionable and directly follow from the outcome of the preceding step.
ATTENTION: response carefully referenced "Response Output Format" in format.
#### Input Format
**Question:** First, clarify the problem to be solved.
#### Response Output Format
**Action Status:** Set to 'finished' or 'planning'.
If it's 'finished', the PLAN is to provide the final answer to the original question.
If it's 'planning', the PLAN is to provide a Python list[str] of achievable tasks.
**PLAN:**
```list
[
"First, we should ...",
]
```
"""
TOOL_PLANNER_PROMPT = """#### Tool Planner Assistance Guidance
Helps the user break down a process of tool usage into a series of plans.
If there are no available tools, it can answer the question directly.
Respond to humans in the most helpful and accurate way possible.
You can use the following tool: {formatted_tools}
#### Input Format
**Origin Query:** the initial question or objective that the user wanted to achieve
**Context:** the current status and history of the tasks to determine if Origin Query has been achieved.
#### Response Output Format
**Action Status:** Set to 'finished' or 'planning'. If it's 'finished', the PLAN is to provide the final answer to the original question. If it's 'planning', the PLAN is to provide a sequence of achievable tasks.
**PLAN:**
```python
[
"First, we should ...",
]
```
"""
GENERAL_PLANNER_PROMPT = """你是一个通用计划拆解助手,将问题拆解问题成各个详细明确的步骤计划或直接回答问题,尽可能地以有帮助和准确的方式回应人类,
使用 JSON Blob 来指定一个返回的内容提供一个 action行动和一个 plans 生成的计划
有效的 'action' 值为'planning'(拆解计划) or 'only_answer' 不需要拆解问题即可直接回答问题
有效的 'plans' 值为: 一个任务列表按顺序写出需要执行的计划
在每个 $JSON_BLOB 中仅提供一个 action如下所示
```
{{'action': 'planning', 'plans': [$PLAN1, $PLAN2, $PLAN3, ..., $PLANN], }}
或者
{{'action': 'only_answer', 'plans': "直接回答问题", }}
```
按照以下格式进行回应
问题输入问题以回答
行动
```
$JSON_BLOB
```
"""
DATA_PLANNER_PROMPT = """你是一个数据分析助手,能够根据问题来制定一个详细明确的数据分析计划,尽可能地以有帮助和准确的方式回应人类,
使用 JSON Blob 来指定一个返回的内容提供一个 action行动和一个 plans 生成的计划
有效的 'action' 值为'planning'(拆解计划) or 'only_answer' 不需要拆解问题即可直接回答问题
有效的 'plans' 值为: 一份数据分析计划清单按顺序排列用文本表示
在每个 $JSON_BLOB 中仅提供一个 action如下所示
```
{{'action': 'planning', 'plans': '$PLAN1, $PLAN2, ..., $PLAN3' }}
```
按照以下格式进行回应
问题输入问题以回答
行动
```
$JSON_BLOB
```
"""
# TOOL_PLANNER_PROMPT = """你是一个工具使用过程的计划拆解助手,将问题拆解为一系列的工具使用计划,若没有可用工具则直接回答问题,尽可能地以有帮助和准确的方式回应人类,你可以使用以下工具:
# {formatted_tools}
# 使用 JSON Blob 来指定一个返回的内容,提供一个 action行动和一个 plans (生成的计划)。
# 有效的 'action' 值为:'planning'(拆解计划) or 'only_answer' (不需要拆解问题即可直接回答问题)。
# 有效的 'plans' 值为: 一个任务列表,按顺序写出需要使用的工具和使用该工具的理由
# 在每个 $JSON_BLOB 中仅提供一个 action如下两个示例所示
# ```
# {{'action': 'planning', 'plans': [$PLAN1, $PLAN2, $PLAN3, ..., $PLANN], }}
# ```
# 或者 若无法通过以上工具解决问题,则直接回答问题
# ```
# {{'action': 'only_answer', 'plans': "直接回答问题", }}
# ```
# 按照以下格式进行回应:
# 问题:输入问题以回答
# 行动:
# ```
# $JSON_BLOB
# ```
# """

View File

@ -1,54 +0,0 @@
QA_TEMPLATE_PROMPT = """#### Question Answer Assistance Guidance
Based on the information provided, please answer the origin query concisely and professionally.
Attention: Follow the input format and response output format
#### Input Format
**Origin Query:** the initial question or objective that the user wanted to achieve
**Context:** the current status and history of the tasks to determine if Origin Query has been achieved.
**DocInfos:** the relevant doc information or code information; if this is empty, don't refer to it.
#### Response Output Format
**Action Status:** Set to 'Continued' or 'Stopped'.
**Answer:** Response to the user's origin query based on Context and DocInfos. If DocInfos is empty, you can ignore it.
If the answer cannot be derived from the given Context and DocInfos, please say 'The question cannot be answered based on the information provided' and do not add any fabricated elements to the answer.
"""
CODE_QA_PROMPT = """#### Code Answer Assistance Guidance
Based on the information provided, please answer the origin query concisely and professionally.
Attention: Follow the input format and response output format
#### Input Format
**Origin Query:** the initial question or objective that the user wanted to achieve
**DocInfos:** the relevant doc information or code information; if this is empty, don't refer to it.
#### Response Output Format
**Action Status:** Set to 'Continued' or 'Stopped'.
**Answer:** Response to the user's origin query based on Context and DocInfos. If DocInfos is empty, you can ignore it.
If the answer cannot be derived from the given Context and DocInfos, please say 'The question cannot be answered based on the information provided' and do not add any fabricated elements to the answer.
"""
QA_PROMPT = """根据已知信息,简洁和专业的来回答问题。如果无法从中得到答案,请说 “根据已知信息无法回答该问题”,不允许在答案中添加编造成分,答案请使用中文。
使用 JSON Blob 来指定一个返回的内容提供一个 action行动
有效的 'action' 值为'finished'(任务已经可以通过上下文信息可以回答) or 'continue' 上下文信息不足以回答问题
在每个 $JSON_BLOB 中仅提供一个 action如下所示
```
{{'action': $ACTION, 'content': '总结对话内容'}}
```
按照以下格式进行回应
问题输入问题以回答
行动$ACTION
```
$JSON_BLOB
```
"""
# CODE_QA_PROMPT = """【指令】根据已知信息来回答问"""

View File

@ -1,34 +0,0 @@
REACT_CODE_PROMPT = """#### Writing Code Assistance Guidance
When users need help with coding, your role is to provide precise and effective guidance.
Write the code step by step, showing only the part necessary to solve the current problem. Each reply should contain only the code required for the current step.
#### Response Process
**Question:** First, clarify the problem to be solved.
**Thoughts:** Based on the question and observations above, provide the plan for executing this step.
**Action Status:** Set to 'stoped' or 'code_executing'. If it's 'stoped', the action is to provide the final answer to the original question. If it's 'code_executing', the action is to write the code.
**Action:**
```python
# Write your code here
import os
...
```
**Observation:** Check the results and effects of the executed code.
... (Repeat this Thoughts/Action/Observation cycle as needed)
**Thoughts:** I now know the final answer
**Action Status:** Set to 'stoped'
**Action:** The final answer to the original input question
"""

View File

@ -1,31 +0,0 @@
REACT_TEMPLATE_PROMPT = """#### Writing Code Assistance Guidance
When users need help with coding, your role is to provide precise and effective guidance. Write the code step by step, showing only the part necessary to solve the current problem. Each reply should contain only the code required for the current step.
#### Response Process
**Question:** First, clarify the problem to be solved.
**Thoughts:** Based on the question and observations above, provide the plan for executing this step.
**Action Status:** Set to 'stoped' or 'code_executing'. If it's 'stoped', the next action is to provide the final answer to the original question. If it's 'code_executing', the next step is to write the code.
**Action:** Code according to your thoughts. Use this format for code:
```python
# Write your code here
```
**Observation:** Check the results and effects of the executed code.
... (Repeat this Thoughts/Action/Observation cycle as needed)
**Thoughts:** I now know the final answer
**Action Status:** Set to 'stoped'
**Action:** The final answer to the original input question
"""

View File

@ -1,48 +0,0 @@
REACT_TOOL_AND_CODE_PLANNER_PROMPT = """#### Planner Assistance Guidance
When users seek assistance in breaking down complex issues into manageable and actionable steps,
your responsibility is to deliver a well-organized strategy or resolution through the use of tools or coding.
ATTENTION: response carefully referenced "Response Output Format" in format.
You may use the following tools:
{formatted_tools}
Depending on the user's query, the response will either be a plan detailing the use of tools and reasoning, or a direct answer if the problem does not require breaking down.
#### Input Format
**Question:** First, clarify the problem to be solved.
#### Response Output Format
**Action Status:** Set to 'planning' to provide a sequence of tasks, or 'only_answer' to provide a direct response without a plan.
**Action:**
```list
[
"First, we should ...",
]
```
Or, provide the direct answer.
"""
# REACT_TOOL_AND_CODE_PLANNER_PROMPT = """你是一个工具和代码使用过程的计划拆解助手,将问题拆解为一系列的工具使用计划,若没有可用工具则使用代码,尽可能地以有帮助和准确的方式回应人类,你可以使用以下工具:
# {formatted_tools}
# 使用 JSON Blob 来指定一个返回的内容,提供一个 action行动和一个 plans (生成的计划)。
# 有效的 'action' 值为:'planning'(拆解计划) or 'only_answer' (不需要拆解问题即可直接回答问题)。
# 有效的 'plans' 值为: 一个任务列表,按顺序写出需要使用的工具和使用该工具的理由
# 在每个 $JSON_BLOB 中仅提供一个 action如下两个示例所示
# ```
# {{'action': 'planning', 'plans': [$PLAN1, $PLAN2, $PLAN3, ..., $PLANN], }}
# ```
# 或者 若无法通过以上工具或者代码解决问题,则直接回答问题
# ```
# {{'action': 'only_answer', 'plans': "直接回答问题", }}
# ```
# 按照以下格式进行回应($JSON_BLOB要求符合上述规定
# 问题:输入问题以回答
# 行动:
# ```
# $JSON_BLOB
# ```
# """

View File

@ -1,83 +0,0 @@
REACT_TOOL_AND_CODE_PROMPT = """#### Code and Tool Agent Assistance Guidance
When users need help with coding or using tools, your role is to provide precise and effective guidance. Use the tools provided if they can solve the problem, otherwise, write the code step by step, showing only the part necessary to solve the current problem. Each reply should contain only the guidance required for the current step either by tool usage or code.
#### Tool Information
You can use these tools:\n{formatted_tools}
Valid "tool_name" value:\n{tool_names}
#### Response Process
**Question:** Start by understanding the input question to be answered.
**Thoughts:** Considering the user's question, previously executed steps, and the plan, decide whether the current step requires the use of a tool or code_executing. Solve the problem step by step, only displaying the thought process necessary for the current step of solving the problem. If a tool can be used, provide its name and parameters. If code_executing is required, outline the plan for executing this step.
**Action Status:** stoped, tool_using, or code_executing. (Choose one from these three statuses.)
If the task is done, set it to 'stoped'.
If using a tool, set it to 'tool_using'.
If writing code, set it to 'code_executing'.
**Action:**
If using a tool, use the tools by formatting the tool action in JSON from Question and Observation. The format should be:
```json
{{
"tool_name": "$TOOL_NAME",
"tool_params": "$INPUT"
}}
```
If the problem cannot be solved with a tool at the moment, then proceed to solve the issue using code. Output the following format to execute the code:
```python
Write your code here
```
**Observation:** Check the results and effects of the executed action.
... (Repeat this Thoughts/Action/Observation cycle as needed)
**Thoughts:** Conclude the final response to the input question.
**Action Status:** stoped
**Action:** The final answer or guidance to the original input question.
"""
# REACT_TOOL_AND_CODE_PROMPT = """你是一个使用工具与代码的助手。
# 如果现有工具不足以完成整个任务,请不要添加不存在的工具,只使用现有工具完成可能的部分。
# 如果当前步骤不能使用工具完成,将由代码来完成。
# 有效的"action"值为:"stoped"(已经完成用户的任务) 、 "tool_using" (使用工具来回答问题) 或 'code_executing'(结合总结下述思维链过程编写下一步的可执行代码)。
# 尽可能地以有帮助和准确的方式回应人类,你可以使用以下工具:
# {formatted_tools}
# 如果现在的步骤可以用工具解决问题,请仅在每个$JSON_BLOB中提供一个action如下所示
# ```
# {{{{
# "action": $ACTION,
# "tool_name": $TOOL_NAME
# "tool_params": $INPUT
# }}}}
# ```
# 若当前无法通过工具解决问题,则使用代码解决问题
# 请仅在每个$JSON_BLOB中提供一个action如下所示
# ```
# {{{{'action': $ACTION,'code_content': $CODE}}}}
# ```
# 按照以下思维链格式进行回应($JSON_BLOB要求符合上述规定
# 问题:输入问题以回答
# 思考:考虑之前和之后的步骤
# 行动:
# ```
# $JSON_BLOB
# ```
# 观察:行动结果
# ...(重复思考/行动/观察N次
# 思考:我知道该如何回应
# 行动:
# ```
# $JSON_BLOB
# ```
# """

View File

@ -1,81 +0,0 @@
REACT_TOOL_PROMPT = """#### Tool Agent Assistance Guidance
When interacting with users, your role is to respond in a helpful and accurate manner using the tools available. Follow the steps below to ensure efficient and effective use of the tools.
Please note that all the tools you can use are listed below. You can only choose from these tools for use. If there are no suitable tools, please do not invent any tools. Just let the user know that you do not have suitable tools to use.
#### Tool List
you can use these tools:\n{formatted_tools}
valid "tool_name" value is:\n{tool_names}
#### Response Process
**Question:** Start by understanding the input question to be answered.
**Thoughts:** Based on the question and previous observations, plan the approach for using the tool effectively.
**Action Status:** Set to either 'stoped' or 'tool_using'. If 'stoped', provide the final response to the original question. If 'tool_using', proceed with using the specified tool.
**Action:** Use the tools by formatting the tool action in JSON. The format should be:
```json
{{
"tool_name": "$TOOL_NAME",
"tool_params": "$INPUT"
}}
```
**Observation:** Evaluate the outcome of the tool's usage.
... (Repeat this Thoughts/Action/Observation cycle as needed)
**Thoughts:** Determine the final response based on the results.
**Action Status:** Set to 'stoped'
**Action:** Conclude with the final response to the original question in this format:
```json
{{
"tool_params": "Final response to be provided to the user",
"tool_name": "notool",
}}
```
"""
# REACT_TOOL_PROMPT = """Respond to the human as helpfully and accurately as possible. You can use the following tools:
# {formatted_tools}
# Use a json blob to specify a tool, providing an action keyword (the tool name) and a tool_params keyword (the tool input).
# Valid "action" values: "stoped" or "tool_using" (use a tool to answer the question)
# Valid "tool_name" values: {tool_names}
# Provide only ONE action per $JSON_BLOB, as shown below:
# ```
# {{{{
# "action": $ACTION,
# "tool_name": $TOOL_NAME,
# "tool_params": $INPUT
# }}}}
# ```
# Respond in the following format:
# Question: the input question to answer
# Thought: consider previous and subsequent steps
# Action:
# ```
# $JSON_BLOB
# ```
# Observation: the result of the action
# ... (repeat Thought/Action/Observation N times)
# Thought: I know how to respond
# Action:
# ```
# {{{{
# "action": "stoped",
# "tool_name": "notool",
# "tool_params": "the final answer returned to the user"
# }}}}
# ```
# """

View File

@ -1,30 +0,0 @@
REFINE_TEMPLATE_PROMPT = """#### Refiner Assistance Guidance
When users have a sequence of tasks that require optimization or adjustment based on feedback from the context, your role is to refine the existing plan.
Your task is to identify where improvements can be made and provide a revised plan that is more efficient or effective.
Each instruction should be an enhancement of the existing plan and should specify the step from which the changes should be implemented.
#### Input Format
**Context:** Review the history of the plan and feedback to identify areas for improvement.
Take into consideration all feedback information from the current step. If there is no existing plan, generate a new one.
#### Response Output Format
**REASON:** Think step by step about the reason for choosing 'finished', 'unchanged' or 'adjusted'.
**Action Status:** Set to 'finished', 'unchanged' or 'adjusted'.
If it's 'finished', all tasks are accomplished, and no adjustments are needed, so PLAN_STEP is set to -1.
If it's 'unchanged', this PLAN has no problem, just set PLAN_STEP to CURRENT_STEP+1.
If it's 'adjusted', the PLAN is to provide an optimized version of the original plan.
**PLAN:**
```list
[
"First, we should ...",
]
```
**PLAN_STEP:** Set to the plan index from which the changes should start. Index range from 0 to n-1 or -1
If it's 'finished', the PLAN_STEP is -1. If it's 'adjusted', the PLAN_STEP is the index of the first revised task in the sequence.
"""

View File

@ -1,20 +0,0 @@
CONV_SUMMARY_PROMPT = """Respond to the human as helpfully and accurately as possible, answering the question from the valid information in the "background information".
Use a JSON Blob to specify the returned content, providing an action.
Valid 'action' values: 'finished' (the task can already be answered from the context) or 'continue' (answer the question based on the background information).
Provide only ONE action per $JSON_BLOB, as shown below:
```
{{'action': $ACTION, 'content': 'answer the question based on the background information'}}
```
Respond in the following format:
Question: the input question to answer
Action:
```
$JSON_BLOB
```
"""
CONV_SUMMARY_PROMPT = """Respond to the human as helpfully and accurately as possible.
Answer the question from the valid information in the "background information", showing both the reasoning process and the content of the answer.
If the question can be answered from the background information, answer it directly;
otherwise, summarize the content of the "background information".
"""

View File

@ -1,396 +0,0 @@
import re, traceback, uuid, copy, json, os
from loguru import logger
from configs.server_config import SANDBOX_SERVER
from configs.model_config import JUPYTER_WORK_PATH
from dev_opsgpt.connector.schema import (
Memory, Task, Env, Role, Message, ActionStatus, CodeDoc, Doc
)
from dev_opsgpt.tools import DDGSTool, DocRetrieval, CodeRetrieval
from dev_opsgpt.sandbox import PyCodeBox, CodeBoxResponse
class MessageUtils:
def __init__(self, role: Role = None) -> None:
self.role = role
self.codebox = PyCodeBox(
remote_url=SANDBOX_SERVER["url"],
remote_ip=SANDBOX_SERVER["host"],
remote_port=SANDBOX_SERVER["port"],
token="mytoken",
do_code_exe=True,
do_remote=SANDBOX_SERVER["do_remote"],
do_check_net=False
)
def filter(self, message: Message, stop=None) -> Message:
tool_params = self.parser_spec_key(message.role_content, "tool_params")
code_content = self.parser_spec_key(message.role_content, "code_content")
plan = self.parser_spec_key(message.role_content, "plan")
plans = self.parser_spec_key(message.role_content, "plans", do_search=False)
content = self.parser_spec_key(message.role_content, "content", do_search=False)
# logger.debug(f"tool_params: {tool_params}, code_content: {code_content}, plan: {plan}, plans: {plans}, content: {content}")
role_content = tool_params or code_content or plan or plans or content
message.role_content = role_content or message.role_content
return message
def inherit_extrainfo(self, input_message: Message, output_message: Message):
output_message.db_docs = input_message.db_docs
output_message.search_docs = input_message.search_docs
output_message.code_docs = input_message.code_docs
output_message.figures.update(input_message.figures)
output_message.origin_query = input_message.origin_query
output_message.code_engine_name = input_message.code_engine_name
output_message.doc_engine_name = input_message.doc_engine_name
output_message.search_engine_name = input_message.search_engine_name
output_message.top_k = input_message.top_k
output_message.score_threshold = input_message.score_threshold
output_message.cb_search_type = input_message.cb_search_type
output_message.do_doc_retrieval = input_message.do_doc_retrieval
output_message.do_code_retrieval = input_message.do_code_retrieval
output_message.do_tool_retrieval = input_message.do_tool_retrieval
#
output_message.tools = input_message.tools
output_message.agents = input_message.agents
# bug: identical keys get overwritten here
output_message.customed_kargs.update(input_message.customed_kargs)
return output_message
def inherit_baseparam(self, input_message: Message, output_message: Message):
# only update the base parameters
output_message.doc_engine_name = input_message.doc_engine_name
output_message.search_engine_name = input_message.search_engine_name
output_message.top_k = input_message.top_k
output_message.score_threshold = input_message.score_threshold
output_message.cb_search_type = input_message.cb_search_type
output_message.do_doc_retrieval = input_message.do_doc_retrieval
output_message.do_code_retrieval = input_message.do_code_retrieval
output_message.do_tool_retrieval = input_message.do_tool_retrieval
#
output_message.tools = input_message.tools
output_message.agents = input_message.agents
# bug: identical keys get overwritten here
output_message.customed_kargs.update(input_message.customed_kargs)
return output_message
def get_extrainfo_step(self, message: Message, do_search, do_doc_retrieval, do_code_retrieval, do_tool_retrieval) -> Message:
''''''
if do_search:
message = self.get_search_retrieval(message)
if do_doc_retrieval:
message = self.get_doc_retrieval(message)
if do_code_retrieval:
input_message = self.get_code_retrieval(message)
if do_tool_retrieval:
message = self.get_tool_retrieval(message)
return message
def get_search_retrieval(self, message: Message,) -> Message:
SEARCH_ENGINES = {"duckduckgo": DDGSTool}
search_docs = []
for idx, doc in enumerate(SEARCH_ENGINES["duckduckgo"].run(message.role_content, 3)):
doc.update({"index": idx})
search_docs.append(Doc(**doc))
message.search_docs = search_docs
return message
def get_doc_retrieval(self, message: Message) -> Message:
query = message.role_content
knowledge_basename = message.doc_engine_name
top_k = message.top_k
score_threshold = message.score_threshold
if knowledge_basename:
docs = DocRetrieval.run(query, knowledge_basename, top_k, score_threshold)
message.db_docs = [Doc(**doc) for doc in docs]
return message
def get_code_retrieval(self, message: Message) -> Message:
# DocRetrieval.run("what is langchain", "DSADSAD")
query = message.input_query
code_engine_name = message.code_engine_name
history_node_list = message.history_node_list
code_docs = CodeRetrieval.run(code_engine_name, query, code_limit=message.top_k, history_node_list=history_node_list, search_type=message.cb_search_type)
message.code_docs = [CodeDoc(**doc) for doc in code_docs]
return message
def get_tool_retrieval(self, message: Message) -> Message:
return message
def step_router(self, message: Message, history: Memory = None, background: Memory = None, memory_pool: Memory=None) -> tuple[Message, ...]:
''''''
# message = self.parser(message)
# logger.debug(f"message.action_status: {message.action_status}")
observation_message = None
if message.action_status == ActionStatus.CODE_EXECUTING:
message, observation_message = self.code_step(message)
elif message.action_status == ActionStatus.TOOL_USING:
message, observation_message = self.tool_step(message)
elif message.action_status == ActionStatus.CODING2FILE:
self.save_code2file(message)
elif message.action_status == ActionStatus.CODE_RETRIEVAL:
pass
elif message.action_status == ActionStatus.CODING:
pass
return message, observation_message
def code_step(self, message: Message) -> Message:
'''execute code'''
# logger.debug(f"message.role_content: {message.role_content}, message.code_content: {message.code_content}")
code_answer = self.codebox.chat('```python\n{}```'.format(message.code_content))
code_prompt = f"The return error after executing the above code is {code_answer.code_exe_response}, which needs to be fixed" \
if code_answer.code_exe_type == "error" else f"The return information after executing the above code is {code_answer.code_exe_response}"
observation_message = Message(
role_name="observation",
role_type="func", #self.role.role_type,
role_content="",
step_content="",
input_query=message.code_content,
)
uid = str(uuid.uuid1())
if code_answer.code_exe_type == "image/png":
message.figures[uid] = code_answer.code_exe_response
message.code_answer = f"\n**Observation:**: The return figure name is {uid} after executing the above code.\n"
message.observation = f"\n**Observation:**: The return figure name is {uid} after executing the above code.\n"
message.step_content += f"\n**Observation:**: The return figure name is {uid} after executing the above code.\n"
# message.role_content += f"\n**Observation:**: executing the above code produced a figure named {uid}\n"
observation_message.role_content = f"\n**Observation:**: The return figure name is {uid} after executing the above code.\n"
observation_message.parsed_output = {"Observation": f"The return figure name is {uid} after executing the above code."}
else:
message.code_answer = code_answer.code_exe_response
message.observation = code_answer.code_exe_response
message.step_content += f"\n**Observation:**: {code_prompt}\n"
# message.role_content += f"\n**Observation:**: {code_prompt}\n"
observation_message.role_content = f"\n**Observation:**: {code_prompt}\n"
observation_message.parsed_output = {"Observation": code_prompt}
# logger.info(f"**Observation:** {message.action_status}, {message.observation}")
return message, observation_message
def tool_step(self, message: Message) -> Message:
'''execute tool'''
# logger.debug(f"{message}")
observation_message = Message(
role_name="observation",
role_type="function", #self.role.role_type,
role_content="\n**Observation:** there is no tool that can be executed\n",
step_content="",
input_query=str(message.tool_params),
tools=message.tools,
)
# logger.debug(f"message: {message.action_status}, {message.tool_name}, {message.tool_params}")
tool_names = [tool.name for tool in message.tools]
if message.tool_name not in tool_names:
message.tool_answer = "\n**Observation:** there is no tool that can be executed\n"
message.observation = "\n**Observation:** there is no tool that can be executed\n"
# message.role_content += f"\n**Observation:**: there is no tool that can be executed\n"
message.step_content += f"\n**Observation:** there is no tool that can be executed\n"
observation_message.role_content = f"\n**Observation:** there is no tool that can be executed\n"
observation_message.parsed_output = {"Observation": "there is no tool that can be executed\n"}
for tool in message.tools:
if tool.name == message.tool_name:
tool_res = tool.func(**message.tool_params.get("tool_params", {}))
logger.debug(f"tool_res {tool_res}")
message.tool_answer = tool_res
message.observation = tool_res
# message.role_content += f"\n**Observation:**: {tool_res}\n"
message.step_content += f"\n**Observation:** {tool_res}\n"
observation_message.role_content = f"\n**Observation:** {tool_res}\n"
observation_message.parsed_output = {"Observation": tool_res}
break
# logger.info(f"**Observation:** {message.action_status}, {message.observation}")
return message, observation_message
def parser(self, message: Message) -> Message:
''''''
content = message.role_content
parser_keys = ["action", "code_content", "code_filename", "tool_params", "plans"]
try:
s_json = self._parse_json(content)
message.action_status = s_json.get("action")
message.code_content = s_json.get("code_content")
message.tool_params = s_json.get("tool_params")
message.tool_name = s_json.get("tool_name")
message.code_filename = s_json.get("code_filename")
message.plans = s_json.get("plans")
# for parser_key in parser_keys:
# message.action_status = content.get(parser_key)
except Exception as e:
# logger.warning(f"{traceback.format_exc()}")
def parse_text_to_dict(text):
# Define a regular expression pattern to capture the key and value
main_pattern = r"\*\*(.+?):\*\*\s*(.*?)\s*(?=\*\*|$)"
list_pattern = r'```python\n(.*?)```'
# Use re.findall to find all main matches in the text
main_matches = re.findall(main_pattern, text, re.DOTALL)
# Convert main matches to a dictionary
parsed_dict = {key.strip(): value.strip() for key, value in main_matches}
for k, v in parsed_dict.items():
for pattern in [list_pattern]:
if "PLAN" != k: continue
v = v.replace("```list", "```python")
match_value = re.search(pattern, v, re.DOTALL)
if match_value:
# Add the code block to the dictionary
parsed_dict[k] = eval(match_value.group(1).strip())
break
return parsed_dict
def extract_content_from_backticks(text):
code_blocks = []
lines = text.split('\n')
is_code_block = False
code_block = ''
language = ''
for line in lines:
if line.startswith('```') and not is_code_block:
is_code_block = True
language = line[3:]
code_block = ''
elif line.startswith('```') and is_code_block:
is_code_block = False
code_blocks.append({language.strip(): code_block.strip()})
elif is_code_block:
code_block += line + '\n'
return code_blocks
def parse_dict_to_dict(parsed_dict) -> dict:
code_pattern = r'```python\n(.*?)```'
tool_pattern = r'```json\n(.*?)```'
pattern_dict = {"code": code_pattern, "json": tool_pattern}
spec_parsed_dict = copy.deepcopy(parsed_dict)
for key, pattern in pattern_dict.items():
for k, text in parsed_dict.items():
# Search for the code block
if not isinstance(text, str): continue
_match = re.search(pattern, text, re.DOTALL)
if _match:
# Add the code block to the dictionary
try:
spec_parsed_dict[key] = json.loads(_match.group(1).strip())
except:
spec_parsed_dict[key] = _match.group(1).strip()
break
return spec_parsed_dict
parsed_dict = parse_text_to_dict(content)
spec_parsed_dict = parse_dict_to_dict(parsed_dict)
action_value = parsed_dict.get('Action Status')
if action_value:
action_value = action_value.lower()
logger.info(f'{message.role_name}: action_value: {action_value}')
# action_value = self._match(r"'action':\s*'([^']*)'", content) if "'action'" in content else self._match(r'"action":\s*"([^"]*)"', content)
code_content_value = spec_parsed_dict.get('code')
# code_content_value = self._match(r"'code_content':\s*'([^']*)'", content) if "'code_content'" in content else self._match(r'"code_content":\s*"([^"]*)"', content)
filename_value = self._match(r"'code_filename':\s*'([^']*)'", content) if "'code_filename'" in content else self._match(r'"code_filename":\s*"([^"]*)"', content)
if action_value == 'tool_using':
tool_params_value = spec_parsed_dict.get('json')
else:
tool_params_value = None
# tool_params_value = spec_parsed_dict.get('tool_params')
# tool_params_value = self._match(r"'tool_params':\s*(\{[^{}]*\})", content, do_json=True) if "'tool_params'" in content \
# else self._match(r'"tool_params":\s*(\{[^{}]*\})', content, do_json=True)
tool_name_value = self._match(r"'tool_name':\s*'([^']*)'", content) if "'tool_name'" in content else self._match(r'"tool_name":\s*"([^"]*)"', content)
plans_value = self._match(r"'plans':\s*(\[.*?\])", content, do_search=False) if "'plans'" in content else self._match(r'"plans":\s*(\[.*?\])', content, do_search=False, )
# regex-based parsing
message.action_status = action_value or "default"
message.code_content = code_content_value
message.code_filename = filename_value
message.tool_params = tool_params_value
message.tool_name = tool_name_value
message.plans = plans_value
message.parsed_output = parsed_dict
message.spec_parsed_output = spec_parsed_dict
# logger.debug(f"确认当前的action: {message.action_status}")
return message
def parser_spec_key(self, content, key, do_search=True, do_json=False) -> str:
''''''
key2pattern = {
"'action'": r"'action':\s*'([^']*)'", '"action"': r'"action":\s*"([^"]*)"',
"'code_content'": r"'code_content':\s*'([^']*)'", '"code_content"': r'"code_content":\s*"([^"]*)"',
"'code_filename'": r"'code_filename':\s*'([^']*)'", '"code_filename"': r'"code_filename":\s*"([^"]*)"',
"'tool_params'": r"'tool_params':\s*(\{[^{}]*\})", '"tool_params"': r'"tool_params":\s*(\{[^{}]*\})',
"'tool_name'": r"'tool_name':\s*'([^']*)'", '"tool_name"': r'"tool_name":\s*"([^"]*)"',
"'plans'": r"'plans':\s*(\[.*?\])", '"plans"': r'"plans":\s*(\[.*?\])',
"'content'": r"'content':\s*'([^']*)'", '"content"': r'"content":\s*"([^"]*)"',
}
s_json = self._parse_json(content)
try:
if s_json and key in s_json:
return str(s_json[key])
except:
pass
keystr = f"'{key}'" if f"'{key}'" in content else f'"{key}"'
return self._match(key2pattern.get(keystr, fr"'{key}':\s*'([^']*)'"), content, do_search=do_search, do_json=do_json)
def _match(self, pattern, s, do_search=True, do_json=False):
try:
if do_search:
match = re.search(pattern, s)
if match:
value = match.group(1).replace("\\n", "\n")
if do_json:
value = json.loads(value)
else:
value = None
else:
match = re.findall(pattern, s, re.DOTALL)
if match:
value = match[0]
if do_json:
value = json.loads(value)
else:
value = None
except Exception as e:
logger.warning(f"{traceback.format_exc()}")
# logger.debug(f"pattern: {pattern}, s: {s}, match: {match}")
return value
def _parse_json(self, s):
try:
pattern = r"```([^`]+)```"
match = re.findall(pattern, s)
if match:
return eval(match[0])
except:
pass
return None
def save_code2file(self, message: Message, project_dir=JUPYTER_WORK_PATH):
filename = message.parsed_output.get("SaveFileName")
code = message.spec_parsed_output.get("code")
for k, v in {"&gt;": ">", "&ge;": ">=", "&lt;": "<", "&le;": "<="}.items():
code = code.replace(k, v)
file_path = os.path.join(project_dir, filename)
if not os.path.exists(file_path):
os.makedirs(os.path.dirname(file_path), exist_ok=True)
with open(file_path, "w") as f:
f.write(code)
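
One detail worth calling out in `save_code2file` above: comparison operators sometimes come back HTML-escaped, so the entities are unescaped before the code is written to disk. A standalone sketch of that step; the file name and snippet are made up:

```python
import os

# Hypothetical LLM output in which comparison operators were HTML-escaped.
code = "for i in range(10):\n    if i &gt;= 5 and i &lt; 8:\n        print(i)\n"
for entity, char in {"&gt;": ">", "&ge;": ">=", "&lt;": "<", "&le;": "<="}.items():
    code = code.replace(entity, char)

# Stand-in for JUPYTER_WORK_PATH; any writable directory works for the sketch.
file_path = os.path.join("jupyter_work", "loop_demo.py")
os.makedirs(os.path.dirname(file_path), exist_ok=True)
with open(file_path, "w") as f:
    f.write(code)
```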

View File

@ -1,3 +0,0 @@
from .base_phase import BasePhase
__all__ = ["BasePhase"]

View File

@ -1,256 +0,0 @@
from typing import List, Union, Dict, Tuple
import os
import json
import importlib
import copy
from loguru import logger
from dev_opsgpt.connector.agents import BaseAgent, SelectorAgent
from dev_opsgpt.connector.chains import BaseChain
from dev_opsgpt.tools.base_tool import BaseTools, Tool
from dev_opsgpt.connector.schema import (
Memory, Task, Env, Role, Message, Doc, AgentConfig, ChainConfig, PhaseConfig, CodeDoc,
load_chain_configs, load_phase_configs, load_role_configs
)
from dev_opsgpt.connector.configs import AGETN_CONFIGS, CHAIN_CONFIGS, PHASE_CONFIGS
from dev_opsgpt.connector.message_process import MessageUtils
role_configs = load_role_configs(AGETN_CONFIGS)
chain_configs = load_chain_configs(CHAIN_CONFIGS)
phase_configs = load_phase_configs(PHASE_CONFIGS)
CUR_DIR = os.path.dirname(os.path.abspath(__file__))
class BasePhase:
def __init__(
self,
phase_name: str,
task: Task = None,
do_summary: bool = False,
do_search: bool = False,
do_doc_retrieval: bool = False,
do_code_retrieval: bool = False,
do_tool_retrieval: bool = False,
phase_config: Union[dict, str] = PHASE_CONFIGS,
chain_config: Union[dict, str] = CHAIN_CONFIGS,
role_config: Union[dict, str] = AGETN_CONFIGS,
) -> None:
self.conv_summary_agent = BaseAgent(role=role_configs["conv_summary"].role,
task = None,
memory = None,
do_search = role_configs["conv_summary"].do_search,
do_doc_retrieval = role_configs["conv_summary"].do_doc_retrieval,
do_tool_retrieval = role_configs["conv_summary"].do_tool_retrieval,
do_filter=False, do_use_self_memory=False)
self.chains: List[BaseChain] = self.init_chains(
phase_name,
task=task,
memory=None,
phase_config = phase_config,
chain_config = chain_config,
role_config = role_config,
)
self.message_utils = MessageUtils()
self.phase_name = phase_name
self.do_summary = do_summary
self.do_search = do_search
self.do_code_retrieval = do_code_retrieval
self.do_doc_retrieval = do_doc_retrieval
self.do_tool_retrieval = do_tool_retrieval
#
self.global_memory = Memory(messages=[])
# self.chain_message = Memory([])
self.phase_memory: List[Memory] = []
# memory_pool dont have specific order
self.memory_pool = Memory(messages=[])
def astep(self, query: Message, history: Memory = None) -> Tuple[Message, Memory]:
summary_message = None
chain_message = Memory(messages=[])
local_phase_memory = Memory(messages=[])
# do_search, do_doc_search, do_code_search
query = self.message_utils.get_extrainfo_step(query, self.do_search, self.do_doc_retrieval, self.do_code_retrieval, self.do_tool_retrieval)
input_message = copy.deepcopy(query)
self.global_memory.append(input_message)
local_phase_memory.append(input_message)
for chain in self.chains:
# chain can supply background and query to next chain
for output_message, local_chain_memory in chain.astep(input_message, history, background=chain_message, memory_pool=self.memory_pool):
# logger.debug(f"local_memory: {local_memory + chain_memory}")
yield output_message, local_phase_memory + local_chain_memory
output_message = self.message_utils.inherit_extrainfo(input_message, output_message)
input_message = output_message
logger.info(f"{chain.chainConfig.chain_name} phase_step: {output_message.role_content}")
# this part is also problematic
self.global_memory.extend(local_chain_memory)
local_phase_memory.extend(local_chain_memory)
# whether to use summary_llm
if self.do_summary:
logger.info(f"{self.conv_summary_agent.role.role_name} input global memory: {local_phase_memory.to_str_messages(content_key='step_content')}")
for summary_message in self.conv_summary_agent.arun(query, background=local_phase_memory, memory_pool=self.memory_pool):
pass
# summary_message = Message(**summary_message)
summary_message.role_name = chain.chainConfig.chain_name
summary_message = self.conv_summary_agent.message_utils.parser(summary_message)
summary_message = self.conv_summary_agent.message_utils.filter(summary_message)
summary_message = self.message_utils.inherit_extrainfo(output_message, summary_message)
chain_message.append(summary_message)
message = summary_message or output_message
yield message, local_phase_memory
# since there is no multi-round chain execution, just keeping the memory as-is is enough
for chain in self.chains:
self.phase_memory.append(chain.global_memory)
# TODO: local_memory is missing the step that appends the summary
message = summary_message or output_message
message.role_name = self.phase_name
yield message, local_phase_memory
def step(self, query: Message, history: Memory = None) -> Tuple[Message, Memory]:
for message, local_phase_memory in self.astep(query, history=history):
pass
return message, local_phase_memory
def init_chains(self, phase_name, phase_config, chain_config,
role_config, task=None, memory=None) -> List[BaseChain]:
# load config
role_configs = load_role_configs(role_config)
chain_configs = load_chain_configs(chain_config)
phase_configs = load_phase_configs(phase_config)
chains = []
self.chain_module = importlib.import_module("dev_opsgpt.connector.chains")
self.agent_module = importlib.import_module("dev_opsgpt.connector.agents")
phase = phase_configs.get(phase_name)
logger.info(f"start to init the phase, the phase_name is {phase_name}, it contains these chains such as {phase.chains}")
for chain_name in phase.chains:
# logger.debug(f"{chain_configs.keys()}")
chain_config = chain_configs[chain_name]
logger.info(f"start to init the chain, the chain_name is {chain_name}, it contains these agents such as {chain_config.agents}")
agents = []
for agent_name in chain_config.agents:
agent_config = role_configs[agent_name]
baseAgent: BaseAgent = getattr(self.agent_module, agent_config.role.agent_type)
base_agent = baseAgent(
agent_config.role,
task = task,
memory = memory,
chat_turn=agent_config.chat_turn,
do_search = agent_config.do_search,
do_doc_retrieval = agent_config.do_doc_retrieval,
do_tool_retrieval = agent_config.do_tool_retrieval,
stop= agent_config.stop,
focus_agents=agent_config.focus_agents,
focus_message_keys=agent_config.focus_message_keys,
)
if agent_config.role.agent_type == "SelectorAgent":
for group_agent_name in agent_config.group_agents:
group_agent_config = role_configs[group_agent_name]
baseAgent: BaseAgent = getattr(self.agent_module, group_agent_config.role.agent_type)
group_base_agent = baseAgent(
group_agent_config.role,
task = task,
memory = memory,
chat_turn=group_agent_config.chat_turn,
do_search = group_agent_config.do_search,
do_doc_retrieval = group_agent_config.do_doc_retrieval,
do_tool_retrieval = group_agent_config.do_tool_retrieval,
stop= group_agent_config.stop,
focus_agents=group_agent_config.focus_agents,
focus_message_keys=group_agent_config.focus_message_keys,
)
base_agent.group_agents.append(group_base_agent)
agents.append(base_agent)
chain_instance = BaseChain(
chain_config, agents, chain_config.chat_turn,
do_checker=chain_configs[chain_name].do_checker,
)
chains.append(chain_instance)
return chains
# def get_extrainfo_step(self, input_message):
# if self.do_doc_retrieval:
# input_message = self.get_doc_retrieval(input_message)
# # logger.debug(F"self.do_code_retrieval: {self.do_code_retrieval}")
# if self.do_code_retrieval:
# input_message = self.get_code_retrieval(input_message)
# if self.do_search:
# input_message = self.get_search_retrieval(input_message)
# return input_message
# def inherit_extrainfo(self, input_message: Message, output_message: Message):
# output_message.db_docs = input_message.db_docs
# output_message.search_docs = input_message.search_docs
# output_message.code_docs = input_message.code_docs
# output_message.figures.update(input_message.figures)
# output_message.origin_query = input_message.origin_query
# return output_message
# def get_search_retrieval(self, message: Message,) -> Message:
# SEARCH_ENGINES = {"duckduckgo": DDGSTool}
# search_docs = []
# for idx, doc in enumerate(SEARCH_ENGINES["duckduckgo"].run(message.role_content, 3)):
# doc.update({"index": idx})
# search_docs.append(Doc(**doc))
# message.search_docs = search_docs
# return message
# def get_doc_retrieval(self, message: Message) -> Message:
# query = message.role_content
# knowledge_basename = message.doc_engine_name
# top_k = message.top_k
# score_threshold = message.score_threshold
# if knowledge_basename:
# docs = DocRetrieval.run(query, knowledge_basename, top_k, score_threshold)
# message.db_docs = [Doc(**doc) for doc in docs]
# return message
# def get_code_retrieval(self, message: Message) -> Message:
# # DocRetrieval.run("langchain是什么", "DSADSAD")
# query = message.input_query
# code_engine_name = message.code_engine_name
# history_node_list = message.history_node_list
# code_docs = CodeRetrieval.run(code_engine_name, query, code_limit=message.top_k, history_node_list=history_node_list)
# message.code_docs = [CodeDoc(**doc) for doc in code_docs]
# return message
# def get_tool_retrieval(self, message: Message) -> Message:
# return message
def update(self) -> Memory:
pass
def get_memory(self, ) -> Memory:
return Memory.from_memory_list(
[chain.get_memory() for chain in self.chains]
)
def get_memory_str(self, do_all_memory=True, content_key="role_content") -> str:
memory = self.global_memory if do_all_memory else self.phase_memory
return "\n".join([": ".join(i) for i in memory.to_tuple_messages(content_key=content_key)])
def get_chains_memory(self, content_key="role_content") -> List[Tuple]:
return [memory.to_tuple_messages(content_key=content_key) for memory in self.phase_memory]
def get_chains_memory_str(self, content_key="role_content") -> str:
return "************".join([f"{chain.chainConfig.chain_name}\n" + chain.get_memory_str(content_key=content_key) for chain in self.chains])

View File

@ -1,9 +0,0 @@
from .memory import Memory
from .general_schema import *
from .message import Message
__all__ = [
"Memory", "ActionStatus", "Doc", "CodeDoc", "Task",
"Env", "Role", "ChainConfig", "AgentConfig", "PhaseConfig", "Message",
"load_role_configs", "load_chain_configs", "load_phase_configs"
]

View File

@ -1,257 +0,0 @@
from pydantic import BaseModel
from typing import List, Dict
from enum import Enum
import re
import json
from loguru import logger
from langchain.tools import BaseTool
class ActionStatus(Enum):
DEFAUILT = "default"
FINISHED = "finished"
STOPED = "stoped"
CONTINUED = "continued"
TOOL_USING = "tool_using"
CODING = "coding"
CODE_EXECUTING = "code_executing"
CODING2FILE = "coding2file"
PLANNING = "planning"
UNCHANGED = "unchanged"
ADJUSTED = "adjusted"
CODE_RETRIEVAL = "code_retrieval"
def __eq__(self, other):
if isinstance(other, str):
return self.value.lower() == other.lower()
return super().__eq__(other)
class Action(BaseModel):
action_name: str
description: str
class FinishedAction(Action):
action_name: str = ActionStatus.FINISHED
description: str = "provide the final answer to the original query to break the chain answer"
class StopedAction(Action):
action_name: str = ActionStatus.STOPED
description: str = "provide the final answer to the original query to break the agent answer"
class ContinuedAction(Action):
action_name: str = ActionStatus.CONTINUED
description: str = "can't provide the final answer to the original query"
class ToolUsingAction(Action):
action_name: str = ActionStatus.TOOL_USING
description: str = "proceed with using the specified tool."
class CodingdAction(Action):
action_name: str = ActionStatus.CODING
description: str = "provide the answer by writing code"
class Coding2FileAction(Action):
action_name: str = ActionStatus.CODING2FILE
description: str = "provide the answer by writing code and filename"
class CodeExecutingAction(Action):
action_name: str = ActionStatus.CODE_EXECUTING
description: str = "provide the answer by writing executable code"
class PlanningAction(Action):
action_name: str = ActionStatus.PLANNING
description: str = "provide a sequence of tasks"
class UnchangedAction(Action):
action_name: str = ActionStatus.UNCHANGED
description: str = "this PLAN has no problem, just set PLAN_STEP to CURRENT_STEP+1."
class AdjustedAction(Action):
action_name: str = ActionStatus.ADJUSTED
description: str = "the PLAN is to provide an optimized version of the original plan."
# extended action example
class CodeRetrievalAction(Action):
action_name: str = ActionStatus.CODE_RETRIEVAL
description: str = "execute the code retrieval to acquire more code information"
class RoleTypeEnums(Enum):
SYSTEM = "system"
USER = "user"
ASSISTANT = "assistant"
FUNCTION = "function"
OBSERVATION = "observation"
def __eq__(self, other):
if isinstance(other, str):
return self.value == other
return super().__eq__(other)
class PromptKey(BaseModel):
key_name: str
description: str
class PromptKeyEnums(Enum):
# Origin Query is ui's user question
ORIGIN_QUERY = "origin_query"
# agent's input from last agent
CURRENT_QUESTION = "current_question"
# ui memory contains (user and assistants)
UI_MEMORY = "ui_memory"
# agent's memory
SELF_MEMORY = "self_memory"
# chain memory
CHAIN_MEMORY = "chain_memory"
# agent's memory
SELF_LOCAL_MEMORY = "self_local_memory"
# chain memory
CHAIN_LOCAL_MEMORY = "chain_local_memory"
# Doc information contains (Doc\Code\Search)
DOC_INFOS = "doc_infos"
def __eq__(self, other):
if isinstance(other, str):
return self.value == other
return super().__eq__(other)
class Doc(BaseModel):
title: str
snippet: str
link: str
index: int
def get_title(self):
return self.title
def get_snippet(self, ):
return self.snippet
def get_link(self, ):
return self.link
def get_index(self, ):
return self.index
def to_json(self):
return vars(self)
def __str__(self,):
return f"""出处 [{self.index + 1}] 标题 [{self.title}]\n\n来源 ({self.link}) \n\n内容 {self.snippet}\n\n"""
class CodeDoc(BaseModel):
code: str
related_nodes: list
index: int
def get_code(self, ):
return self.code
def get_related_node(self, ):
return self.related_nodes
def get_index(self, ):
return self.index
def to_json(self):
return vars(self)
def __str__(self,):
return f"""出处 [{self.index + 1}] \n\n来源 ({self.related_nodes}) \n\n内容 {self.code}\n\n"""
class Task(BaseModel):
task_type: str
task_name: str
task_desc: str
task_prompt: str
# def __init__(self, task_type, task_name, task_desc) -> None:
# self.task_type = task_type
# self.task_name = task_name
# self.task_desc = task_desc
class Env(BaseModel):
env_type: str
env_name: str
env_desc:str
class Role(BaseModel):
role_type: str
role_name: str
role_desc: str
agent_type: str = ""
role_prompt: str = ""
template_prompt: str = ""
class ChainConfig(BaseModel):
chain_name: str
chain_type: str
agents: List[str]
do_checker: bool = False
chat_turn: int = 1
clear_structure: bool = False
brainstorming: bool = False
gui_design: bool = True
git_management: bool = False
self_improve: bool = False
class AgentConfig(BaseModel):
role: Role
stop: str = None
chat_turn: int = 1
do_search: bool = False
do_doc_retrieval: bool = False
do_tool_retrieval: bool = False
focus_agents: List = []
focus_message_keys: List = []
group_agents: List = []
class PhaseConfig(BaseModel):
phase_name: str
phase_type: str
chains: List[str]
do_summary: bool = False
do_search: bool = False
do_doc_retrieval: bool = False
do_code_retrieval: bool = False
do_tool_retrieval: bool = False
def load_role_configs(config) -> Dict[str, AgentConfig]:
if isinstance(config, str):
with open(config, 'r', encoding="utf8") as file:
configs = json.load(file)
else:
configs = config
return {name: AgentConfig(**v) for name, v in configs.items()}
def load_chain_configs(config) -> Dict[str, ChainConfig]:
if isinstance(config, str):
with open(config, 'r', encoding="utf8") as file:
configs = json.load(file)
else:
configs = config
return {name: ChainConfig(**v) for name, v in configs.items()}
def load_phase_configs(config) -> Dict[str, PhaseConfig]:
if isinstance(config, str):
with open(config, 'r', encoding="utf8") as file:
configs = json.load(file)
else:
configs = config
return {name: PhaseConfig(**v) for name, v in configs.items()}
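
Two behaviours of this schema module that the rest of the connector relies on: `ActionStatus` members compare equal to plain strings (case-insensitively), and the `load_*_configs` helpers accept either a dict or a path to a JSON file. A small sketch with an invented role config:

```python
# Sketch; assumes the dev_opsgpt package is importable. The config dict is invented.
from dev_opsgpt.connector.schema import ActionStatus, load_role_configs

assert ActionStatus.TOOL_USING == "tool_using"
assert ActionStatus.STOPED == "Stoped"      # __eq__ lower-cases the string operand

sample_config = {
    "demo_agent": {
        "role": {
            "role_type": "assistant",
            "role_name": "demo_agent",
            "role_desc": "toy agent used only for this example",
            "agent_type": "BaseAgent",
        },
        "chat_turn": 1,
    }
}
role_configs = load_role_configs(sample_config)   # also accepts a JSON file path
print(role_configs["demo_agent"].role.role_name)  # -> demo_agent
```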

View File

@ -1,114 +0,0 @@
from pydantic import BaseModel
from typing import List, Union
from loguru import logger
from .message import Message
from dev_opsgpt.utils.common_utils import (
save_to_jsonl_file, save_to_json_file, read_json_file, read_jsonl_file
)
class Memory(BaseModel):
messages: List[Message] = []
# def __init__(self, messages: List[Message] = []):
# self.messages = messages
def append(self, message: Message):
self.messages.append(message)
def extend(self, memory: 'Memory'):
self.messages.extend(memory.messages)
def update(self, role_name: str, role_type: str, role_content: str):
self.messages.append(Message(role_name, role_type, role_content, role_content))
def clear(self, ):
self.messages = []
def delete(self, ):
pass
def get_messages(self, k=0) -> List[Message]:
"""Return the most recent k memories, return all when k=0"""
return self.messages[-k:]
def save(self, file_type="jsonl", return_all=True):
try:
if file_type == "jsonl":
save_to_jsonl_file(self.to_dict_messages(return_all=return_all), "role_name_history"+f".{file_type}")
return True
elif file_type in ["json", "txt"]:
save_to_json_file(self.to_dict_messages(return_all=return_all), "role_name_history"+f".{file_type}")
return True
except:
return False
return False
def load(self, filepath):
file_type = filepath
try:
if file_type == "jsonl":
self.messages = [Message(**message) for message in read_jsonl_file(filepath)]
return True
elif file_type in ["json", "txt"]:
self.messages = [Message(**message) for message in read_jsonl_file(filepath)]
return True
except:
return False
return False
def to_tuple_messages(self, return_all: bool = True, content_key="role_content", filter_roles=[]):
# logger.debug(f"{[message.to_tuple_message(return_all, content_key) for message in self.messages ]}")
return [
message.to_tuple_message(return_all, content_key) for message in self.messages
if message.role_name not in filter_roles
]
def to_dict_messages(self, return_all: bool = True, content_key="role_content", filter_roles=[]):
return [
message.to_dict_message(return_all, content_key) for message in self.messages
if message.role_name not in filter_roles
]
def to_str_messages(self, return_all: bool = True, content_key="role_content", filter_roles=[]):
# for message in self.messages:
# logger.debug(f"{message.to_tuple_message(return_all, content_key)}")
# logger.debug(f"{[message.to_tuple_message(return_all, content_key) for message in self.messages ]}")
return "\n\n".join([message.to_str_content(return_all, content_key) for message in self.messages
if message.role_name not in filter_roles
])
def get_parserd_output(self, ):
return [message.parsed_output for message in self.messages]
def get_parserd_output_list(self, ):
# for message in self.messages:
# logger.debug(f"{message.role_name}: {message.parsed_output_list}")
return [parsed_output for message in self.messages for parsed_output in message.parsed_output_list[1:]]
def get_rolenames(self, ):
''''''
return [message.role_name for message in self.messages]
@classmethod
def from_memory_list(cls, memorys: List['Memory']) -> 'Memory':
return cls(messages=[message for memory in memorys for message in memory.get_messages()])
def __len__(self, ):
return len(self.messages)
def __str__(self) -> str:
return "\n".join([":".join(i) for i in self.to_tuple_messages()])
def __add__(self, other: Union[Message, 'Memory']) -> 'Memory':
if isinstance(other, Message):
return Memory(messages=self.messages + [other])
elif isinstance(other, Memory):
return Memory(messages=self.messages + other.messages)
else:
raise ValueError(f"cant add unspecified type like as {type(other)}")
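
`Memory` is a thin pydantic wrapper over a list of `Message` objects, with `+` overloaded for both a single message and another memory. A sketch; the field values are illustrative:

```python
# Sketch; assumes the dev_opsgpt package is importable.
from dev_opsgpt.connector.schema import Memory, Message

memory = Memory(messages=[])
memory.append(Message(role_name="user", role_type="human", role_content="hello"))
memory = memory + Message(role_name="ai", role_type="assistant", role_content="hi there")

print(len(memory))                 # -> 2
print(memory.to_tuple_messages())  # -> [('user', 'hello'), ('ai', 'hi there')]
print(memory.to_str_messages())    # -> "user: hello\n\nai: hi there"
```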

View File

@ -1,96 +0,0 @@
from pydantic import BaseModel
from loguru import logger
from .general_schema import *
class Message(BaseModel):
chat_index: str = None
role_name: str
role_type: str
role_prompt: str = None
input_query: str = None
origin_query: str = None
# llm output
role_content: str = None
step_content: str = None
# llm parsed information
plans: List[str] = None
code_content: str = None
code_filename: str = None
tool_params: str = None
tool_name: str = None
parsed_output: dict = {}
spec_parsed_output: dict = {}
parsed_output_list: List[Dict] = []
# llm\tool\code execution information
action_status: str = ActionStatus.DEFAUILT
agent_index: int = None
code_answer: str = None
tool_answer: str = None
observation: str = None
figures: Dict[str, str] = {}
# prompt support information
tools: List[BaseTool] = []
task: Task = None
db_docs: List['Doc'] = []
code_docs: List['CodeDoc'] = []
search_docs: List['Doc'] = []
agents: List = []
# phase input
phase_name: str = None
chain_name: str = None
do_search: bool = False
doc_engine_name: str = None
code_engine_name: str = None
cb_search_type: str = None
search_engine_name: str = None
top_k: int = 3
score_threshold: float = 1.0
do_doc_retrieval: bool = False
do_code_retrieval: bool = False
do_tool_retrieval: bool = False
history_node_list: List[str] = []
# user's customed kargs for init or end action
customed_kargs: dict = {}
def to_tuple_message(self, return_all: bool = True, content_key="role_content"):
role_content = self.to_str_content(False, content_key)
if return_all:
return (self.role_name, role_content)
else:
return (role_content)
def to_dict_message(self, return_all: bool = True, content_key="role_content"):
role_content = self.to_str_content(False, content_key)
if return_all:
return {"role": self.role_name, "content": role_content}
else:
return vars(self)
def to_str_content(self, return_all: bool = True, content_key="role_content"):
if content_key == "role_content":
role_content = self.role_content or self.input_query
elif content_key == "step_content":
role_content = self.step_content or self.role_content or self.input_query
else:
role_content = self.role_content or self.input_query
if return_all:
return f"{self.role_name}: {role_content}"
else:
return role_content
def is_system_role(self,):
return self.role_type == "system"
def __str__(self) -> str:
# key_str = '\n'.join([k for k, v in vars(self).items()])
# logger.debug(f"{key_str}")
return "\n".join([": ".join([k, str(v)]) for k, v in vars(self).items()])

View File

@ -1,52 +0,0 @@
import re
def parse_section(text, section_name):
# Define a pattern to extract the named section along with its content
section_pattern = rf'#### {section_name}\n(.*?)(?=####|$)'
# Find the specific section content
section_content = re.search(section_pattern, text, re.DOTALL)
if section_content:
# If the section is found, extract the content
content = section_content.group(1)
# Define a pattern to find segments that follow the format **xx:**
segments_pattern = r'\*\*([^*]+):\*\*'
# Use findall method to extract all matches in the section content
segments = re.findall(segments_pattern, content)
return segments
else:
# If the section is not found, return an empty list
return []
def prompt_cost(model_type: str, num_prompt_tokens: float, num_completion_tokens: float):
input_cost_map = {
"gpt-3.5-turbo": 0.0015,
"gpt-3.5-turbo-16k": 0.003,
"gpt-3.5-turbo-0613": 0.0015,
"gpt-3.5-turbo-16k-0613": 0.003,
"gpt-4": 0.03,
"gpt-4-0613": 0.03,
"gpt-4-32k": 0.06,
}
output_cost_map = {
"gpt-3.5-turbo": 0.002,
"gpt-3.5-turbo-16k": 0.004,
"gpt-3.5-turbo-0613": 0.002,
"gpt-3.5-turbo-16k-0613": 0.004,
"gpt-4": 0.06,
"gpt-4-0613": 0.06,
"gpt-4-32k": 0.12,
}
if model_type not in input_cost_map or model_type not in output_cost_map:
return -1
return num_prompt_tokens * input_cost_map[model_type] / 1000.0 + num_completion_tokens * output_cost_map[model_type] / 1000.0
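
`prompt_cost` returns US dollars based on the per-1K-token prices above; for example, 1,000 prompt tokens plus 500 completion tokens on gpt-3.5-turbo cost 1000 * 0.0015 / 1000 + 500 * 0.002 / 1000 = 0.0025 USD, and unpriced models return -1. A sketch; the import path is an assumption:

```python
# Sketch; the module path below is an assumption about where these helpers live.
from dev_opsgpt.connector.utils import prompt_cost  # assumed path

cost = prompt_cost("gpt-3.5-turbo", num_prompt_tokens=1000, num_completion_tokens=500)
print(round(cost, 4))                           # -> 0.0025
print(prompt_cost("unknown-model", 1000, 500))  # -> -1 for models without a price entry
```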

View File

@ -1,7 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/16 下午3:15
@desc:
'''

View File

@ -1,7 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/20 下午3:07
@desc:
'''

View File

@ -1,270 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: nebula_handler.py
@time: 2023/11/16 下午3:15
@desc:
'''
import time
from loguru import logger
from nebula3.gclient.net import ConnectionPool
from nebula3.Config import Config
class NebulaHandler:
def __init__(self, host: str, port: int, username: str, password: str = '', space_name: str = ''):
'''
init nebula connection_pool
@param host: host
@param port: port
@param username: username
@param password: password
'''
config = Config()
self.connection_pool = ConnectionPool()
self.connection_pool.init([(host, port)], config)
self.username = username
self.password = password
self.space_name = space_name
def execute_cypher(self, cypher: str, space_name: str = '', format_res: bool = False, use_space_name: bool = True):
'''
@param space_name: space_name, if provided, will execute use space_name first
@param cypher:
@return:
'''
with self.connection_pool.session_context(self.username, self.password) as session:
if use_space_name:
if space_name:
cypher = f'USE {space_name};{cypher}'
elif self.space_name:
cypher = f'USE {self.space_name};{cypher}'
logger.debug(cypher)
resp = session.execute(cypher)
if format_res:
resp = self.result_to_dict(resp)
return resp
def close_connection(self):
self.connection_pool.close()
def create_space(self, space_name: str, vid_type: str, comment: str = ''):
'''
create space
@param space_name: cannot start with a number
@return:
'''
cypher = f'CREATE SPACE IF NOT EXISTS {space_name} (vid_type={vid_type}) comment="{comment}";'
resp = self.execute_cypher(cypher, use_space_name=False)
return resp
def show_space(self):
cypher = 'SHOW SPACES'
resp = self.execute_cypher(cypher)
return resp
def drop_space(self, space_name):
cypher = f'DROP SPACE {space_name}'
return self.execute_cypher(cypher)
def create_tag(self, tag_name: str, prop_dict: dict = {}):
'''
create tag
@param tag_name: tag name
@param prop_dict: property dict {'prop name': 'prop type'}
@return:
'''
cypher = f'CREATE TAG IF NOT EXISTS {tag_name}'
cypher += '('
for k, v in prop_dict.items():
cypher += f'{k} {v},'
cypher = cypher.rstrip(',')
cypher += ')'
cypher += ';'
res = self.execute_cypher(cypher, self.space_name)
return res
def show_tags(self):
'''
show tags
@return:
'''
cypher = 'SHOW TAGS'
resp = self.execute_cypher(cypher, self.space_name)
return resp
def insert_vertex(self, tag_name: str, value_dict: dict):
'''
insert vertex
@param tag_name:
@param value_dict: {'properties_name': [], values: {'vid':[]}} order should be the same in properties_name and values
@return:
'''
cypher = f'INSERT VERTEX {tag_name} ('
properties_name = value_dict['properties_name']
for property_name in properties_name:
cypher += f'{property_name},'
cypher = cypher.rstrip(',')
cypher += ') VALUES '
for vid, properties in value_dict['values'].items():
cypher += f'"{vid}":('
for property in properties:
if type(property) == str:
cypher += f'"{property}",'
else:
cypher += f'{property}'
cypher = cypher.rstrip(',')
cypher += '),'
cypher = cypher.rstrip(',')
cypher += ';'
res = self.execute_cypher(cypher, self.space_name)
return res
def create_edge_type(self, edge_type_name: str, prop_dict: dict = {}):
'''
create edge type
@param edge_type_name: edge type name
@param prop_dict: property dict {'prop name': 'prop type'}
@return:
'''
cypher = f'CREATE EDGE IF NOT EXISTS {edge_type_name}'
cypher += '('
for k, v in prop_dict.items():
cypher += f'{k} {v},'
cypher = cypher.rstrip(',')
cypher += ')'
cypher += ';'
res = self.execute_cypher(cypher, self.space_name)
return res
def show_edge_type(self):
'''
show edge types
@return:
'''
cypher = 'SHOW EDGES'
resp = self.execute_cypher(cypher, self.space_name)
return resp
def drop_edge_type(self, edge_type_name: str):
cypher = f'DROP EDGE {edge_type_name}'
return self.execute_cypher(cypher, self.space_name)
def insert_edge(self, edge_type_name: str, value_dict: dict):
'''
insert edge
@param edge_type_name:
@param value_dict: value_dict: {'properties_name': [], values: {(src_vid, dst_vid):[]}} order should be the
same in properties_name and values
@return:
'''
cypher = f'INSERT EDGE {edge_type_name} ('
properties_name = value_dict['properties_name']
for property_name in properties_name:
cypher += f'{property_name},'
cypher = cypher.rstrip(',')
cypher += ') VALUES '
for (src_vid, dst_vid), properties in value_dict['values'].items():
cypher += f'"{src_vid}"->"{dst_vid}":('
for property in properties:
if type(property) == str:
cypher += f'"{property}",'
else:
cypher += f'{property}'
cypher = cypher.rstrip(',')
cypher += '),'
cypher = cypher.rstrip(',')
cypher += ';'
res = self.execute_cypher(cypher, self.space_name)
return res
def set_space_name(self, space_name):
self.space_name = space_name
def add_host(self, host: str, port: str):
'''
add host
@return:
'''
cypher = f'ADD HOSTS {host}:{port}'
res = self.execute_cypher(cypher)
return res
def get_stat(self):
'''
@return:
'''
submit_cypher = 'SUBMIT JOB STATS;'
self.execute_cypher(cypher=submit_cypher, space_name=self.space_name)
time.sleep(2)
stats_cypher = 'SHOW STATS;'
stats_res = self.execute_cypher(cypher=stats_cypher, space_name=self.space_name)
res = {'vertices': -1, 'edges': -1}
stats_res_dict = self.result_to_dict(stats_res)
for idx in range(len(stats_res_dict['Type'])):
t = stats_res_dict['Type'][idx].as_string()
name = stats_res_dict['Name'][idx].as_string()
count = stats_res_dict['Count'][idx].as_int()
if t == 'Space' and name in res:
res[name] = count
return res
def get_vertices(self, tag_name: str = '', limit: int = 10000):
'''
get all vertices
@return:
'''
if tag_name:
cypher = f'''MATCH (v:{tag_name}) RETURN v LIMIT {limit};'''
else:
cypher = f'MATCH (v) RETURN v LIMIT {limit};'
res = self.execute_cypher(cypher, self.space_name)
return self.result_to_dict(res)
def result_to_dict(self, result) -> dict:
"""
build a list for each column and return them as a dict keyed by column name
"""
# logger.info(result.error_msg())
assert result.is_succeeded()
columns = result.keys()
d = {}
for col_num in range(result.col_size()):
col_name = columns[col_num]
col_list = result.column_values(col_name)
d[col_name] = [x for x in col_list]
return d
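
Typical `NebulaHandler` usage: open the pool, create and select a space, define tags and edge types, then insert with the dict layout the insert helpers expect. Host, credentials and the toy schema below are illustrative, and the import path is an assumption:

```python
# Sketch; assumes a reachable NebulaGraph instance and that the module path matches.
from dev_opsgpt.db_handler.graph_db_handler.nebula_handler import NebulaHandler  # assumed path

handler = NebulaHandler(host="127.0.0.1", port=9669, username="root", password="nebula")
handler.create_space("code_graph", vid_type="FIXED_STRING(32)")
handler.set_space_name("code_graph")
# In practice, wait a couple of heartbeats here for the new space/schema to propagate.

handler.create_tag("package", {"pname": "string"})
handler.insert_vertex(
    "package",
    {"properties_name": ["pname"], "values": {"pkg_utils": ["utils"], "mod_io": ["io"]}},
)
handler.create_edge_type("contains", {"cnt": "int"})
handler.insert_edge(
    "contains",
    {"properties_name": ["cnt"], "values": {("pkg_utils", "mod_io"): [1]}},
)
print(handler.get_stat())   # e.g. {'vertices': 2, 'edges': 1}
handler.close_connection()
```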

View File

@ -1,7 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: __init__.py.py
@time: 2023/11/20 下午3:08
@desc:
'''

View File

@ -1,140 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: chroma_handler.py
@time: 2023/11/21 下午12:21
@desc:
'''
from loguru import logger
import chromadb
class ChromaHandler:
def __init__(self, path: str, collection_name: str = ''):
'''
init client
@param path: path of data
@collection_name: name of collection
'''
self.client = chromadb.PersistentClient(path)
self.client.heartbeat()
if collection_name:
self.collection = self.client.get_or_create_collection(name=collection_name)
def create_collection(self, collection_name: str):
'''
create collection; returns an error result if it already exists
@return:
'''
try:
collection = self.client.create_collection(name=collection_name)
except Exception as e:
return {'result_code': -1, 'msg': f'fail, error={e}'}
return {'result_code': 0, 'msg': 'success'}
def delete_collection(self, collection_name: str):
'''
@param collection_name:
@return:
'''
try:
self.client.delete_collection(name=collection_name)
except Exception as e:
return {'result_code': -1, 'msg': f'fail, error={e}'}
return {'result_code': 0, 'msg': 'success'}
def set_collection(self, collection_name: str):
'''
@param collection_name:
@return:
'''
try:
self.collection = self.client.get_collection(collection_name)
except Exception as e:
return {'result_code': -1, 'msg': f'fail, error={e}'}
return {'result_code': 0, 'msg': 'success'}
def add_data(self, ids: list, documents: list = None, embeddings: list = None, metadatas: list = None):
'''
add data to chroma
@param documents: list of doc string
@param embeddings: list of vector
@param metadatas: list of metadata
@param ids: list of id
@return:
'''
try:
self.collection.add(
ids=ids,
embeddings=embeddings,
metadatas=metadatas,
documents=documents
)
except Exception as e:
return {'result_code': -1, 'msg': f'fail, error={e}'}
return {'result_code': 0, 'msg': 'success'}
def query(self, query_embeddings=None, query_texts=None, n_results=10, where=None, where_document=None,
include=["metadatas", "documents", "distances"]):
'''
@param query_embeddings:
@param query_texts:
@param n_results:
@param where:
@param where_document:
@param include:
@return:
'''
try:
query_result = self.collection.query(query_embeddings=query_embeddings, query_texts=query_texts,
n_results=n_results, where=where, where_document=where_document,
include=include)
return {'result_code': 0, 'msg': 'success', 'result': query_result}
except Exception as e:
return {'result_code': -1, 'msg': f'fail, error={e}'}
def get(self, ids=None, where=None, limit=None, offset=None, where_document=None, include=["metadatas", "documents"]):
'''
get by condition
@param ids:
@param where:
@param limit:
@param offset:
@param where_document:
@param include:
@return:
'''
try:
query_result = self.collection.get(ids=ids, where=where, where_document=where_document,
limit=limit,
offset=offset, include=include)
return {'result_code': 0, 'msg': 'success', 'result': query_result}
except Exception as e:
return {'result_code': -1, 'msg': f'fail, error={e}'}
def peek(self, limit: int=10):
'''
peek
@param limit:
@return:
'''
try:
query_result = self.collection.peek(limit)
return {'result_code': 0, 'msg': 'success', 'result': query_result}
except Exception as e:
return {'result_code': -1, 'msg': f'fail, error={e}'}
def count(self):
'''
count
@return:
'''
try:
query_result = self.collection.count()
return {'result_code': 0, 'msg': 'success', 'result': query_result}
except Exception as e:
return {'result_code': -1, 'msg': f'fail, error={e}'}
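
A short `ChromaHandler` usage sketch: persist a couple of documents and query them back. It relies on chromadb's default embedding function; the import path and data are illustrative:

```python
# Sketch; assumes chromadb is installed and the module path matches your checkout.
from dev_opsgpt.db_handler.vector_db_handler.chroma_handler import ChromaHandler  # assumed path

handler = ChromaHandler(path="./chroma_data", collection_name="code_docs")
handler.add_data(
    ids=["doc-1", "doc-2"],
    documents=["def add(a, b): return a + b", "def sub(a, b): return a - b"],
    metadatas=[{"file": "math_utils.py"}, {"file": "math_utils.py"}],
)
res = handler.query(query_texts=["addition helper"], n_results=1)
if res["result_code"] == 0:
    print(res["result"]["documents"])   # nearest document(s) for the query text
print(handler.count())                  # -> {'result_code': 0, 'msg': 'success', 'result': 2}
```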

View File

@ -1,6 +0,0 @@
from .json_loader import JSONLoader
from .jsonl_loader import JSONLLoader
__all__ = [
"JSONLoader", "JSONLLoader"
]

View File

@ -1,61 +0,0 @@
import json
from pathlib import Path
from typing import AnyStr, Callable, Dict, List, Optional, Union
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter, TextSplitter
from dev_opsgpt.utils.common_utils import read_json_file
class JSONLoader(BaseLoader):
def __init__(
self,
file_path: Union[str, Path],
schema_key: str = "all_text",
content_key: Optional[str] = None,
metadata_func: Optional[Callable[[Dict, Dict], Dict]] = None,
text_content: bool = True,
):
self.file_path = Path(file_path).resolve()
self.schema_key = schema_key
self._content_key = content_key
self._metadata_func = metadata_func
self._text_content = text_content
def load(self, ) -> List[Document]:
"""Load and return documents from the JSON file."""
docs: List[Document] = []
datas = read_json_file(self.file_path)
self._parse(datas, docs)
return docs
def _parse(self, datas: List, docs: List[Document]) -> None:
for idx, sample in enumerate(datas):
metadata = dict(
source=str(self.file_path),
seq_num=idx,
)
text = sample.get(self.schema_key, "")
docs.append(Document(page_content=text, metadata=metadata))
def load_and_split(
self, text_splitter: Optional[TextSplitter] = None
) -> List[Document]:
"""Load Documents and split into chunks. Chunks are returned as Documents.
Args:
text_splitter: TextSplitter instance to use for splitting documents.
Defaults to RecursiveCharacterTextSplitter.
Returns:
List of Documents.
"""
if text_splitter is None:
_text_splitter: TextSplitter = RecursiveCharacterTextSplitter()
else:
_text_splitter = text_splitter
docs = self.load()
return _text_splitter.split_documents(docs)
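
A small `JSONLoader` usage sketch: write a JSON array of records, load each record's `all_text` field into a `Document`, then split. File name and contents are made up, and the import path is an assumption:

```python
# Sketch; assumes the dev_opsgpt package (and langchain) are importable.
import json
from dev_opsgpt.document_loaders.json_loader import JSONLoader  # assumed path

records = [
    {"all_text": "CodeFuse supports knowledge-base retrieval."},
    {"all_text": "Phases are built from chains, which are built from agents."},
]
with open("kb_snippets.json", "w", encoding="utf-8") as f:
    json.dump(records, f)

loader = JSONLoader("kb_snippets.json", schema_key="all_text")
docs = loader.load()
print(len(docs), docs[0].metadata["seq_num"], docs[0].page_content)
chunks = loader.load_and_split()   # RecursiveCharacterTextSplitter by default
```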

View File

@ -1,62 +0,0 @@
import json
from pathlib import Path
from typing import AnyStr, Callable, Dict, List, Optional, Union
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter, TextSplitter
from dev_opsgpt.utils.common_utils import read_jsonl_file
class JSONLLoader(BaseLoader):
def __init__(
self,
file_path: Union[str, Path],
schema_key: str = "all_text",
content_key: Optional[str] = None,
metadata_func: Optional[Callable[[Dict, Dict], Dict]] = None,
text_content: bool = True,
):
self.file_path = Path(file_path).resolve()
self.schema_key = schema_key
self._content_key = content_key
self._metadata_func = metadata_func
self._text_content = text_content
def load(self, ) -> List[Document]:
"""Load and return documents from the JSON file."""
docs: List[Document] = []
datas = read_jsonl_file(self.file_path)
self._parse(datas, docs)
return docs
def _parse(self, datas: List, docs: List[Document]) -> None:
for idx, sample in enumerate(datas):
metadata = dict(
source=str(self.file_path),
seq_num=idx,
)
text = sample.get(self.schema_key, "")
docs.append(Document(page_content=text, metadata=metadata))
def load_and_split(
self, text_splitter: Optional[TextSplitter] = None
) -> List[Document]:
"""Load Documents and split into chunks. Chunks are returned as Documents.
Args:
text_splitter: TextSplitter instance to use for splitting documents.
Defaults to RecursiveCharacterTextSplitter.
Returns:
List of Documents.
"""
if text_splitter is None:
_text_splitter: TextSplitter = RecursiveCharacterTextSplitter()
else:
_text_splitter = text_splitter
docs = self.load()
return _text_splitter.split_documents(docs)

View File

@ -1,37 +0,0 @@
from typing import List
from langchain.embeddings.base import Embeddings
from langchain.schema import Document
class BaseVSCService:
def do_create_kb(self):
pass
def do_drop_kb(self):
pass
def do_add_doc(self, docs: List[Document], embeddings: Embeddings):
pass
def do_clear_vs(self):
pass
def vs_type(self) -> str:
return "default"
def do_init(self):
pass
def do_search(self):
pass
def do_insert_multi_knowledge(self):
pass
def do_insert_one_knowledge(self):
pass
def do_delete_doc(self):
pass

View File

@ -1,776 +0,0 @@
"""Wrapper around FAISS vector database."""
from __future__ import annotations
import operator
import os
import pickle
import uuid
import warnings
from pathlib import Path
from typing import (
Any,
Callable,
Dict,
Iterable,
List,
Optional,
Sized,
Tuple,
)
import numpy as np
from langchain.docstore.base import AddableMixin, Docstore
from langchain.docstore.document import Document
from langchain.docstore.in_memory import InMemoryDocstore
from langchain.embeddings.base import Embeddings
from langchain.vectorstores.base import VectorStore
from langchain.vectorstores.utils import DistanceStrategy, maximal_marginal_relevance
def dependable_faiss_import(no_avx2: Optional[bool] = None) -> Any:
"""
Import faiss if available, otherwise raise error.
If FAISS_NO_AVX2 environment variable is set, it will be considered
to load FAISS with no AVX2 optimization.
Args:
no_avx2: Load FAISS strictly with no AVX2 optimization
so that the vectorstore is portable and compatible with other devices.
"""
if no_avx2 is None and "FAISS_NO_AVX2" in os.environ:
no_avx2 = bool(os.getenv("FAISS_NO_AVX2"))
try:
if no_avx2:
from faiss import swigfaiss as faiss
else:
import faiss
except ImportError:
raise ImportError(
"Could not import faiss python package. "
"Please install it with `pip install faiss-gpu` (for CUDA supported GPU) "
"or `pip install faiss-cpu` (depending on Python version)."
)
return faiss
def _len_check_if_sized(x: Any, y: Any, x_name: str, y_name: str) -> None:
if isinstance(x, Sized) and isinstance(y, Sized) and len(x) != len(y):
raise ValueError(
f"{x_name} and {y_name} expected to be equal length but "
f"len({x_name})={len(x)} and len({y_name})={len(y)}"
)
return
class FAISS(VectorStore):
"""Wrapper around FAISS vector database.
To use, you must have the ``faiss`` python package installed.
Example:
.. code-block:: python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
embeddings = OpenAIEmbeddings()
texts = ["FAISS is an important library", "LangChain supports FAISS"]
faiss = FAISS.from_texts(texts, embeddings)
"""
def __init__(
self,
embedding_function: Callable,
index: Any,
docstore: Docstore,
index_to_docstore_id: Dict[int, str],
relevance_score_fn: Optional[Callable[[float], float]] = None,
normalize_L2: bool = False,
distance_strategy: DistanceStrategy = DistanceStrategy.EUCLIDEAN_DISTANCE,
):
"""Initialize with necessary components."""
self.embedding_function = embedding_function
self.index = index
self.docstore = docstore
self.index_to_docstore_id = index_to_docstore_id
self.distance_strategy = distance_strategy
self.override_relevance_score_fn = relevance_score_fn
self._normalize_L2 = normalize_L2
if (
self.distance_strategy != DistanceStrategy.EUCLIDEAN_DISTANCE
and self._normalize_L2
):
warnings.warn(
"Normalizing L2 is not applicable for metric type: {strategy}".format(
strategy=self.distance_strategy
)
)
def __add(
self,
texts: Iterable[str],
embeddings: Iterable[List[float]],
metadatas: Optional[Iterable[dict]] = None,
ids: Optional[List[str]] = None,
) -> List[str]:
faiss = dependable_faiss_import()
if not isinstance(self.docstore, AddableMixin):
raise ValueError(
"If trying to add texts, the underlying docstore should support "
f"adding items, which {self.docstore} does not"
)
_len_check_if_sized(texts, metadatas, "texts", "metadatas")
_metadatas = metadatas or ({} for _ in texts)
documents = [
Document(page_content=t, metadata=m) for t, m in zip(texts, _metadatas)
]
_len_check_if_sized(documents, embeddings, "documents", "embeddings")
_len_check_if_sized(documents, ids, "documents", "ids")
# Add to the index.
vector = np.array(embeddings, dtype=np.float32)
if self._normalize_L2:
faiss.normalize_L2(vector)
self.index.add(vector)
# Add information to docstore and index.
ids = ids or [str(uuid.uuid4()) for _ in texts]
self.docstore.add({id_: doc for id_, doc in zip(ids, documents)})
starting_len = len(self.index_to_docstore_id)
index_to_id = {starting_len + j: id_ for j, id_ in enumerate(ids)}
self.index_to_docstore_id.update(index_to_id)
return ids
def add_texts(
self,
texts: Iterable[str],
metadatas: Optional[List[dict]] = None,
ids: Optional[List[str]] = None,
**kwargs: Any,
) -> List[str]:
"""Run more texts through the embeddings and add to the vectorstore.
Args:
texts: Iterable of strings to add to the vectorstore.
metadatas: Optional list of metadatas associated with the texts.
ids: Optional list of unique IDs.
Returns:
List of ids from adding the texts into the vectorstore.
"""
# embeddings = [self.embedding_function(text) for text in texts]
embeddings = self.embedding_function(texts)
return self.__add(texts, embeddings, metadatas=metadatas, ids=ids)
def add_embeddings(
self,
text_embeddings: Iterable[Tuple[str, List[float]]],
metadatas: Optional[List[dict]] = None,
ids: Optional[List[str]] = None,
**kwargs: Any,
) -> List[str]:
"""Run more texts through the embeddings and add to the vectorstore.
Args:
text_embeddings: Iterable pairs of string and embedding to
add to the vectorstore.
metadatas: Optional list of metadatas associated with the texts.
ids: Optional list of unique IDs.
Returns:
List of ids from adding the texts into the vectorstore.
"""
# Embed and create the documents.
texts, embeddings = zip(*text_embeddings)
return self.__add(texts, embeddings, metadatas=metadatas, ids=ids)
def similarity_search_with_score_by_vector(
self,
embedding: List[float],
k: int = 4,
filter: Optional[Dict[str, Any]] = None,
fetch_k: int = 20,
**kwargs: Any,
) -> List[Tuple[Document, float]]:
"""Return docs most similar to query.
Args:
embedding: Embedding vector to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
filter (Optional[Dict[str, Any]]): Filter by metadata. Defaults to None.
fetch_k: (Optional[int]) Number of Documents to fetch before filtering.
Defaults to 20.
**kwargs: kwargs to be passed to similarity search. Can include:
score_threshold: Optional, a floating point value between 0 and 1 to
filter the resulting set of retrieved docs
Returns:
List of documents most similar to the query text and L2 distance
in float for each. Lower score represents more similarity.
"""
faiss = dependable_faiss_import()
vector = np.array([embedding], dtype=np.float32)
if self._normalize_L2:
faiss.normalize_L2(vector)
scores, indices = self.index.search(vector, k if filter is None else fetch_k)
docs = []
for j, i in enumerate(indices[0]):
if i == -1:
# This happens when not enough docs are returned.
continue
_id = self.index_to_docstore_id[i]
doc = self.docstore.search(_id)
if not isinstance(doc, Document):
raise ValueError(f"Could not find document for id {_id}, got {doc}")
if filter is not None:
filter = {
key: [value] if not isinstance(value, list) else value
for key, value in filter.items()
}
if all(doc.metadata.get(key) in value for key, value in filter.items()):
docs.append((doc, scores[0][j]))
else:
docs.append((doc, scores[0][j]))
score_threshold = kwargs.get("score_threshold")
if score_threshold is not None:
cmp = (
operator.ge
if self.distance_strategy
in (DistanceStrategy.MAX_INNER_PRODUCT, DistanceStrategy.JACCARD)
else operator.le
)
docs = [
(doc, similarity)
for doc, similarity in docs
if cmp(similarity, score_threshold)
]
return docs[:k]
def similarity_search_with_score(
self,
query: str,
k: int = 4,
filter: Optional[Dict[str, Any]] = None,
fetch_k: int = 20,
**kwargs: Any,
) -> List[Tuple[Document, float]]:
"""Return docs most similar to query.
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
fetch_k: (Optional[int]) Number of Documents to fetch before filtering.
Defaults to 20.
Returns:
List of documents most similar to the query text with
L2 distance in float. Lower score represents more similarity.
"""
embedding = self.embedding_function(query)
docs = self.similarity_search_with_score_by_vector(
embedding,
k,
filter=filter,
fetch_k=fetch_k,
**kwargs,
)
return docs
def similarity_search_by_vector(
self,
embedding: List[float],
k: int = 4,
filter: Optional[Dict[str, Any]] = None,
fetch_k: int = 20,
**kwargs: Any,
) -> List[Document]:
"""Return docs most similar to embedding vector.
Args:
embedding: Embedding to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
fetch_k: (Optional[int]) Number of Documents to fetch before filtering.
Defaults to 20.
Returns:
List of Documents most similar to the embedding.
"""
docs_and_scores = self.similarity_search_with_score_by_vector(
embedding,
k,
filter=filter,
fetch_k=fetch_k,
**kwargs,
)
return [doc for doc, _ in docs_and_scores]
def similarity_search(
self,
query: str,
k: int = 4,
filter: Optional[Dict[str, Any]] = None,
fetch_k: int = 20,
**kwargs: Any,
) -> List[Document]:
"""Return docs most similar to query.
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
filter: (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
fetch_k: (Optional[int]) Number of Documents to fetch before filtering.
Defaults to 20.
Returns:
List of Documents most similar to the query.
"""
docs_and_scores = self.similarity_search_with_score(
query, k, filter=filter, fetch_k=fetch_k, **kwargs
)
return [doc for doc, _ in docs_and_scores]
def max_marginal_relevance_search_with_score_by_vector(
self,
embedding: List[float],
*,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
filter: Optional[Dict[str, Any]] = None,
) -> List[Tuple[Document, float]]:
"""Return docs and their similarity scores selected using the maximal marginal
relevance.
Maximal marginal relevance optimizes for similarity to query AND diversity
among selected documents.
Args:
embedding: Embedding to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
fetch_k: Number of Documents to fetch before filtering to
pass to MMR algorithm.
lambda_mult: Number between 0 and 1 that determines the degree
of diversity among the results with 0 corresponding
to maximum diversity and 1 to minimum diversity.
Defaults to 0.5.
Returns:
List of Documents and similarity scores selected by maximal marginal
relevance and score for each.
"""
scores, indices = self.index.search(
np.array([embedding], dtype=np.float32),
fetch_k if filter is None else fetch_k * 2,
)
if filter is not None:
filtered_indices = []
for i in indices[0]:
if i == -1:
# This happens when not enough docs are returned.
continue
_id = self.index_to_docstore_id[i]
doc = self.docstore.search(_id)
if not isinstance(doc, Document):
raise ValueError(f"Could not find document for id {_id}, got {doc}")
if all(
doc.metadata.get(key) in value
if isinstance(value, list)
else doc.metadata.get(key) == value
for key, value in filter.items()
):
filtered_indices.append(i)
indices = np.array([filtered_indices])
# -1 happens when not enough docs are returned.
embeddings = [self.index.reconstruct(int(i)) for i in indices[0] if i != -1]
mmr_selected = maximal_marginal_relevance(
np.array([embedding], dtype=np.float32),
embeddings,
k=k,
lambda_mult=lambda_mult,
)
selected_indices = [indices[0][i] for i in mmr_selected]
selected_scores = [scores[0][i] for i in mmr_selected]
docs_and_scores = []
for i, score in zip(selected_indices, selected_scores):
if i == -1:
# This happens when not enough docs are returned.
continue
_id = self.index_to_docstore_id[i]
doc = self.docstore.search(_id)
if not isinstance(doc, Document):
raise ValueError(f"Could not find document for id {_id}, got {doc}")
docs_and_scores.append((doc, score))
return docs_and_scores
def max_marginal_relevance_search_by_vector(
self,
embedding: List[float],
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
filter: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> List[Document]:
"""Return docs selected using the maximal marginal relevance.
Maximal marginal relevance optimizes for similarity to query AND diversity
among selected documents.
Args:
embedding: Embedding to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
fetch_k: Number of Documents to fetch before filtering to
pass to MMR algorithm.
lambda_mult: Number between 0 and 1 that determines the degree
of diversity among the results with 0 corresponding
to maximum diversity and 1 to minimum diversity.
Defaults to 0.5.
Returns:
List of Documents selected by maximal marginal relevance.
"""
docs_and_scores = self.max_marginal_relevance_search_with_score_by_vector(
embedding, k=k, fetch_k=fetch_k, lambda_mult=lambda_mult, filter=filter
)
return [doc for doc, _ in docs_and_scores]
def max_marginal_relevance_search(
self,
query: str,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
filter: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> List[Document]:
"""Return docs selected using the maximal marginal relevance.
Maximal marginal relevance optimizes for similarity to query AND diversity
among selected documents.
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
fetch_k: Number of Documents to fetch before filtering (if needed) to
pass to MMR algorithm.
lambda_mult: Number between 0 and 1 that determines the degree
of diversity among the results with 0 corresponding
to maximum diversity and 1 to minimum diversity.
Defaults to 0.5.
Returns:
List of Documents selected by maximal marginal relevance.
"""
embedding = self.embedding_function(query)
docs = self.max_marginal_relevance_search_by_vector(
embedding,
k=k,
fetch_k=fetch_k,
lambda_mult=lambda_mult,
filter=filter,
**kwargs,
)
return docs
def delete(self, ids: Optional[List[str]] = None, **kwargs: Any) -> Optional[bool]:
"""Delete by ID. These are the IDs in the vectorstore.
Args:
ids: List of ids to delete.
Returns:
Optional[bool]: True if deletion is successful,
False otherwise, None if not implemented.
"""
if ids is None:
raise ValueError("No ids provided to delete.")
missing_ids = set(ids).difference(self.index_to_docstore_id.values())
if missing_ids:
raise ValueError(
f"Some specified ids do not exist in the current store. Ids not found: "
f"{missing_ids}"
)
reversed_index = {id_: idx for idx, id_ in self.index_to_docstore_id.items()}
index_to_delete = [reversed_index[id_] for id_ in ids]
self.index.remove_ids(np.array(index_to_delete, dtype=np.int64))
self.docstore.delete(ids)
remaining_ids = [
id_
for i, id_ in sorted(self.index_to_docstore_id.items())
if i not in index_to_delete
]
self.index_to_docstore_id = {i: id_ for i, id_ in enumerate(remaining_ids)}
return True
def merge_from(self, target: FAISS) -> None:
"""Merge another FAISS object with the current one.
Add the target FAISS to the current one.
Args:
target: FAISS object you wish to merge into the current one
Returns:
None.
"""
if not isinstance(self.docstore, AddableMixin):
raise ValueError("Cannot merge with this type of docstore")
# Numerical indexes for target docs are offset by the existing index length
starting_len = len(self.index_to_docstore_id)
# Merge two IndexFlatL2
self.index.merge_from(target.index)
# Get id and docs from target FAISS object
full_info = []
for i, target_id in target.index_to_docstore_id.items():
doc = target.docstore.search(target_id)
if not isinstance(doc, Document):
raise ValueError("Document should be returned")
full_info.append((starting_len + i, target_id, doc))
# Add information to docstore and index_to_docstore_id.
self.docstore.add({_id: doc for _, _id, doc in full_info})
index_to_id = {index: _id for index, _id, _ in full_info}
self.index_to_docstore_id.update(index_to_id)
@classmethod
def __from(
cls,
texts: Iterable[str],
embeddings: List[List[float]],
embedding: Embeddings,
metadatas: Optional[Iterable[dict]] = None,
ids: Optional[List[str]] = None,
normalize_L2: bool = False,
distance_strategy: DistanceStrategy = DistanceStrategy.EUCLIDEAN_DISTANCE,
**kwargs: Any,
) -> FAISS:
faiss = dependable_faiss_import()
if distance_strategy == DistanceStrategy.MAX_INNER_PRODUCT:
index = faiss.IndexFlatIP(len(embeddings[0]))
else:
# Default to L2, currently other metric types not initialized.
index = faiss.IndexFlatL2(len(embeddings[0]))
vecstore = cls(
embedding.embed_query,
index,
InMemoryDocstore(),
{},
normalize_L2=normalize_L2,
distance_strategy=distance_strategy,
**kwargs,
)
vecstore.__add(texts, embeddings, metadatas=metadatas, ids=ids)
return vecstore
@classmethod
def from_texts(
cls,
texts: List[str],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
ids: Optional[List[str]] = None,
**kwargs: Any,
) -> FAISS:
"""Construct FAISS wrapper from raw documents.
This is a user friendly interface that:
1. Embeds documents.
2. Creates an in memory docstore
3. Initializes the FAISS database
This is intended to be a quick way to get started.
Example:
.. code-block:: python
from langchain import FAISS
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
faiss = FAISS.from_texts(texts, embeddings)
"""
from loguru import logger
logger.debug(f"texts: {len(texts)}")
embeddings = embedding.embed_documents(texts)
return cls.__from(
texts,
embeddings,
embedding,
metadatas=metadatas,
ids=ids,
**kwargs,
)
@classmethod
def from_embeddings(
cls,
text_embeddings: Iterable[Tuple[str, List[float]]],
embedding: Embeddings,
metadatas: Optional[Iterable[dict]] = None,
ids: Optional[List[str]] = None,
**kwargs: Any,
) -> FAISS:
"""Construct FAISS wrapper from raw documents.
This is a user friendly interface that:
1. Embeds documents.
2. Creates an in memory docstore
3. Initializes the FAISS database
This is intended to be a quick way to get started.
Example:
.. code-block:: python
from langchain import FAISS
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
text_embeddings = embeddings.embed_documents(texts)
text_embedding_pairs = zip(texts, text_embeddings)
faiss = FAISS.from_embeddings(text_embedding_pairs, embeddings)
"""
texts = [t[0] for t in text_embeddings]
embeddings = [t[1] for t in text_embeddings]
return cls.__from(
texts,
embeddings,
embedding,
metadatas=metadatas,
ids=ids,
**kwargs,
)
def save_local(self, folder_path: str, index_name: str = "index") -> None:
"""Save FAISS index, docstore, and index_to_docstore_id to disk.
Args:
folder_path: folder path to save index, docstore,
and index_to_docstore_id to.
index_name: for saving with a specific index file name
"""
path = Path(folder_path)
path.mkdir(exist_ok=True, parents=True)
# save index separately since it is not picklable
faiss = dependable_faiss_import()
faiss.write_index(
self.index, str(path / "{index_name}.faiss".format(index_name=index_name))
)
# save docstore and index_to_docstore_id
with open(path / "{index_name}.pkl".format(index_name=index_name), "wb") as f:
pickle.dump((self.docstore, self.index_to_docstore_id), f)
@classmethod
def load_local(
cls,
folder_path: str,
embeddings: Embeddings,
index_name: str = "index",
**kwargs: Any,
) -> FAISS:
"""Load FAISS index, docstore, and index_to_docstore_id from disk.
Args:
folder_path: folder path to load index, docstore,
and index_to_docstore_id from.
embeddings: Embeddings to use when generating queries
index_name: for loading a specific index file name
"""
path = Path(folder_path)
# load index separately since it is not picklable
faiss = dependable_faiss_import()
index = faiss.read_index(
str(path / "{index_name}.faiss".format(index_name=index_name))
)
# load docstore and index_to_docstore_id
with open(path / "{index_name}.pkl".format(index_name=index_name), "rb") as f:
docstore, index_to_docstore_id = pickle.load(f)
return cls(
embeddings.embed_query, index, docstore, index_to_docstore_id, **kwargs
)
def serialize_to_bytes(self) -> bytes:
"""Serialize FAISS index, docstore, and index_to_docstore_id to bytes."""
return pickle.dumps((self.index, self.docstore, self.index_to_docstore_id))
@classmethod
def deserialize_from_bytes(
cls,
serialized: bytes,
embeddings: Embeddings,
**kwargs: Any,
) -> FAISS:
"""Deserialize FAISS index, docstore, and index_to_docstore_id from bytes."""
index, docstore, index_to_docstore_id = pickle.loads(serialized)
return cls(
embeddings.embed_query, index, docstore, index_to_docstore_id, **kwargs
)
def _select_relevance_score_fn(self) -> Callable[[float], float]:
"""
The 'correct' relevance function
may differ depending on a few things, including:
- the distance / similarity metric used by the VectorStore
- the scale of your embeddings (OpenAI's are unit normed. Many others are not!)
- embedding dimensionality
- etc.
"""
if self.override_relevance_score_fn is not None:
return self.override_relevance_score_fn
# Default strategy is to rely on distance strategy provided in
# vectorstore constructor
if self.distance_strategy == DistanceStrategy.MAX_INNER_PRODUCT:
return self._max_inner_product_relevance_score_fn
elif self.distance_strategy == DistanceStrategy.EUCLIDEAN_DISTANCE:
# Default behavior is to use euclidean distance relevancy
return self._euclidean_relevance_score_fn
else:
raise ValueError(
"Unknown distance strategy, must be cosine, max_inner_product,"
" or euclidean"
)
def _similarity_search_with_relevance_scores(
self,
query: str,
k: int = 4,
filter: Optional[Dict[str, Any]] = None,
fetch_k: int = 20,
**kwargs: Any,
) -> List[Tuple[Document, float]]:
"""Return docs and their similarity scores on a scale from 0 to 1."""
# Pop score threshold so that only relevancy scores, not raw scores, are
# filtered.
relevance_score_fn = self._select_relevance_score_fn()
if relevance_score_fn is None:
raise ValueError(
"normalize_score_fn must be provided to"
" FAISS constructor to normalize scores"
)
docs_and_scores = self.similarity_search_with_score(
query,
k=k,
filter=filter,
fetch_k=fetch_k,
**kwargs,
)
docs_and_rel_scores = [
(doc, relevance_score_fn(score)) for doc, score in docs_and_scores
]
return docs_and_rel_scores
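
A short end-to-end sketch of this FAISS wrapper; it assumes langchain's HuggingFaceEmbeddings is available locally, and the model name and index folder are illustrative only:

from langchain.embeddings.huggingface import HuggingFaceEmbeddings

# Any local embedding model works here; this name is only an example.
emb = HuggingFaceEmbeddings(model_name="shibing624/text2vec-base-chinese")

texts = ["FAISS is an important library", "LangChain supports FAISS"]
store = FAISS.from_texts(texts, emb, metadatas=[{"tag": "a"}, {"tag": "b"}])

# Plain search plus the scored variant (lower L2 distance means more similar).
print(store.similarity_search("which library supports FAISS?", k=1))
print(store.similarity_search_with_score("which library supports FAISS?", k=1))

# Persist the index, docstore and id mapping, then reload them.
store.save_local("faiss_index")
store2 = FAISS.load_local("faiss_index", emb)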


@@ -1,39 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: get_embedding.py
@time: 2023/11/22 上午11:30
@desc:
'''
from loguru import logger
from configs.model_config import EMBEDDING_MODEL
from dev_opsgpt.embeddings.openai_embedding import OpenAIEmbedding
from dev_opsgpt.embeddings.huggingface_embedding import HFEmbedding
def get_embedding(engine: str, text_list: list):
'''
get embedding
@param engine: 'openai' or 'model' (local HuggingFace embedding)
@param text_list: list of texts to embed
@return: dict mapping each text to its embedding vector
'''
emb_res = {}
if engine == 'openai':
oae = OpenAIEmbedding()
emb_res = oae.get_emb(text_list)
elif engine == 'model':
hfe = HFEmbedding(EMBEDDING_MODEL)
emb_res = hfe.get_emb(text_list)
return emb_res
if __name__ == '__main__':
engine = 'model'
text_list = ['这段代码是一个OkHttp拦截器用于在请求头中添加授权令牌。它继承自`com.theokanning.openai.client.AuthenticationInterceptor`类,并且被标记为`@Deprecated`,意味着它已经过时了。\n\n这个拦截器的作用是在每个请求的头部添加一个名为"Authorization"的字段,值为传入的授权令牌。这样,当请求被发送到服务器时,服务器可以使用这个令牌来验证请求的合法性。\n\n这段代码的构造函数接受一个令牌作为参数,并将其传递给父类的构造函数。这个令牌应该是一个有效的授权令牌,用于访问受保护的资源。', '这段代码定义了一个接口`OpenAiApi`,并使用`@Deprecated`注解将其标记为已过时。它还扩展了`com.theokanning.openai.client.OpenAiApi`接口。\n\n`@Deprecated`注解表示该接口已经过时,不推荐使用。开发者应该使用`com.theokanning.openai.client.OpenAiApi`接口代替。\n\n注释中提到这个接口只是为了保持向后兼容性。这意味着它可能是为了与旧版本的代码兼容而保留的,但不推荐在新代码中使用。', '这段代码是一个OkHttp的拦截器用于在请求头中添加授权令牌authorization token\n\n在这个拦截器中首先获取到传入的授权令牌token然后在每个请求的构建过程中使用`newBuilder()`方法创建一个新的请求构建器,并在该构建器中添加一个名为"Authorization"的请求头,值为"Bearer " + token。最后使用该构建器构建一个新的请求并通过`chain.proceed(request)`方法继续处理该请求。\n\n这样当使用OkHttp发送请求时该拦截器会自动在请求头中添加授权令牌以实现身份验证的功能。', '这段代码是一个Java接口用于定义与OpenAI API进行通信的方法。它包含了各种不同类型的请求和响应方法用于与OpenAI API的不同端点进行交互。\n\n接口中的方法包括:\n- `listModels()`:获取可用的模型列表。\n- `getModel(String modelId)`:获取指定模型的详细信息。\n- `createCompletion(CompletionRequest request)`:创建文本生成的请求。\n- `createChatCompletion(ChatCompletionRequest request)`:创建聊天式文本生成的请求。\n- `createEdit(EditRequest request)`:创建文本编辑的请求。\n- `createEmbeddings(EmbeddingRequest request)`:创建文本嵌入的请求。\n- `listFiles()`:获取已上传文件的列表。\n- `uploadFile(RequestBody purpose, MultipartBody.Part file)`:上传文件。\n- `deleteFile(String fileId)`:删除文件。\n- `retrieveFile(String fileId)`:获取文件的详细信息。\n- `retrieveFileContent(String fileId)`:获取文件的内容。\n- `createFineTuningJob(FineTuningJobRequest request)`创建Fine-Tuning任务。\n- `listFineTuningJobs()`获取Fine-Tuning任务的列表。\n- `retrieveFineTuningJob(String fineTuningJobId)`获取指定Fine-Tuning任务的详细信息。\n- `cancelFineTuningJob(String fineTuningJobId)`取消Fine-Tuning任务。\n- `listFineTuningJobEvents(String fineTuningJobId)`获取Fine-Tuning任务的事件列表。\n- `createFineTuneCompletion(CompletionRequest request)`创建Fine-Tuning模型的文本生成请求。\n- `createImage(CreateImageRequest request)`:创建图像生成的请求。\n- `createImageEdit(RequestBody requestBody)`:创建图像编辑的请求。\n- `createImageVariation(RequestBody requestBody)`:创建图像变体的请求。\n- `createTranscription(RequestBody requestBody)`:创建音频转录的请求。\n- `createTranslation(RequestBody requestBody)`:创建音频翻译的请求。\n- `createModeration(ModerationRequest request)`:创建内容审核的请求。\n- `getEngines()`:获取可用的引擎列表。\n- `getEngine(String engineId)`:获取指定引擎的详细信息。\n- `subscription()`:获取账户订阅信息。\n- `billingUsage(LocalDate starDate, LocalDate endDate)`:获取账户消费信息。\n\n这些方法使用不同的HTTP请求类型GET、POST、DELETE和路径来与OpenAI API进行交互并返回相应的响应数据。']
res = get_embedding(engine, text_list)
logger.debug(res)


@@ -1,49 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: huggingface_embedding.py
@time: 2023/11/30 上午11:41
@desc:
'''
from loguru import logger
from configs.model_config import EMBEDDING_DEVICE
from dev_opsgpt.embeddings.utils import load_embeddings
class HFEmbedding:
_instance = {}
def __new__(cls, *args, **kwargs):
instance_key = f'{args},{kwargs}'
if cls._instance.get(instance_key, None):
return cls._instance[instance_key]
else:
cls._instance[instance_key] = super().__new__(cls)
return cls._instance[instance_key]
def __init__(self, model_name):
self.model = load_embeddings(model=model_name, device=EMBEDDING_DEVICE)
logger.debug('load success')
def get_emb(self, text_list):
'''
get embedding
@param text_list: list of texts to embed
@return: dict mapping each text to its embedding vector
'''
logger.info('st')
emb_res = self.model.embed_documents(text_list)
logger.info('ed')
res = {
text_list[idx]: emb_res[idx] for idx in range(len(text_list))
}
return res
if __name__ == '__main__':
model_name = 'text2vec-base'
hfe = HFEmbedding(model_name)
text_list = ['这段代码是一个OkHttp拦截器用于在请求头中添加授权令牌。它继承自`com.theokanning.openai.client.AuthenticationInterceptor`类,并且被标记为`@Deprecated`,意味着它已经过时了。\n\n这个拦截器的作用是在每个请求的头部添加一个名为"Authorization"的字段,值为传入的授权令牌。这样,当请求被发送到服务器时,服务器可以使用这个令牌来验证请求的合法性。\n\n这段代码的构造函数接受一个令牌作为参数,并将其传递给父类的构造函数。这个令牌应该是一个有效的授权令牌,用于访问受保护的资源。', '这段代码定义了一个接口`OpenAiApi`,并使用`@Deprecated`注解将其标记为已过时。它还扩展了`com.theokanning.openai.client.OpenAiApi`接口。\n\n`@Deprecated`注解表示该接口已经过时,不推荐使用。开发者应该使用`com.theokanning.openai.client.OpenAiApi`接口代替。\n\n注释中提到这个接口只是为了保持向后兼容性。这意味着它可能是为了与旧版本的代码兼容而保留的,但不推荐在新代码中使用。', '这段代码是一个OkHttp的拦截器用于在请求头中添加授权令牌authorization token\n\n在这个拦截器中首先获取到传入的授权令牌token然后在每个请求的构建过程中使用`newBuilder()`方法创建一个新的请求构建器,并在该构建器中添加一个名为"Authorization"的请求头,值为"Bearer " + token。最后使用该构建器构建一个新的请求并通过`chain.proceed(request)`方法继续处理该请求。\n\n这样当使用OkHttp发送请求时该拦截器会自动在请求头中添加授权令牌以实现身份验证的功能。', '这段代码是一个Java接口用于定义与OpenAI API进行通信的方法。它包含了各种不同类型的请求和响应方法用于与OpenAI API的不同端点进行交互。\n\n接口中的方法包括:\n- `listModels()`:获取可用的模型列表。\n- `getModel(String modelId)`:获取指定模型的详细信息。\n- `createCompletion(CompletionRequest request)`:创建文本生成的请求。\n- `createChatCompletion(ChatCompletionRequest request)`:创建聊天式文本生成的请求。\n- `createEdit(EditRequest request)`:创建文本编辑的请求。\n- `createEmbeddings(EmbeddingRequest request)`:创建文本嵌入的请求。\n- `listFiles()`:获取已上传文件的列表。\n- `uploadFile(RequestBody purpose, MultipartBody.Part file)`:上传文件。\n- `deleteFile(String fileId)`:删除文件。\n- `retrieveFile(String fileId)`:获取文件的详细信息。\n- `retrieveFileContent(String fileId)`:获取文件的内容。\n- `createFineTuningJob(FineTuningJobRequest request)`创建Fine-Tuning任务。\n- `listFineTuningJobs()`获取Fine-Tuning任务的列表。\n- `retrieveFineTuningJob(String fineTuningJobId)`获取指定Fine-Tuning任务的详细信息。\n- `cancelFineTuningJob(String fineTuningJobId)`取消Fine-Tuning任务。\n- `listFineTuningJobEvents(String fineTuningJobId)`获取Fine-Tuning任务的事件列表。\n- `createFineTuneCompletion(CompletionRequest request)`创建Fine-Tuning模型的文本生成请求。\n- `createImage(CreateImageRequest request)`:创建图像生成的请求。\n- `createImageEdit(RequestBody requestBody)`:创建图像编辑的请求。\n- `createImageVariation(RequestBody requestBody)`:创建图像变体的请求。\n- `createTranscription(RequestBody requestBody)`:创建音频转录的请求。\n- `createTranslation(RequestBody requestBody)`:创建音频翻译的请求。\n- `createModeration(ModerationRequest request)`:创建内容审核的请求。\n- `getEngines()`:获取可用的引擎列表。\n- `getEngine(String engineId)`:获取指定引擎的详细信息。\n- `subscription()`:获取账户订阅信息。\n- `billingUsage(LocalDate starDate, LocalDate endDate)`:获取账户消费信息。\n\n这些方法使用不同的HTTP请求类型GET、POST、DELETE和路径来与OpenAI API进行交互并返回相应的响应数据。']
hfe.get_emb(text_list)


@@ -1,48 +0,0 @@
# encoding: utf-8
'''
@author: 温进
@file: openai_embedding.py
@time: 2023/11/22 上午10:45
@desc:
'''
import openai
import base64
import json
import os
from loguru import logger
class OpenAIEmbedding:
def __init__(self):
pass
def get_emb(self, text_list):
openai.api_key = os.environ["OPENAI_API_KEY"]
openai.api_base = os.environ["API_BASE_URL"]
# strip ',' from each text to work around a known bug
modified_text_list = [i.replace(',', '') for i in text_list]
emb_all_result = openai.Embedding.create(
model="text-embedding-ada-002",
input=modified_text_list
)
res = {}
# logger.debug(emb_all_result)
logger.debug(f'len of result={len(emb_all_result["data"])}')
for emb_result in emb_all_result['data']:
index = emb_result['index']
# logger.debug(index)
text = text_list[index]
emb = emb_result['embedding']
res[text] = emb
return res
if __name__ == '__main__':
oae = OpenAIEmbedding()
res = oae.get_emb(text_list=['这段代码是一个OkHttp拦截器用于在请求头中添加授权令牌。它继承自`com.theokanning.openai.client.AuthenticationInterceptor`类,并且被标记为`@Deprecated`,意味着它已经过时了。\n\n这个拦截器的作用是在每个请求的头部添加一个名为"Authorization"的字段,值为传入的授权令牌。这样,当请求被发送到服务器时,服务器可以使用这个令牌来验证请求的合法性。\n\n这段代码的构造函数接受一个令牌作为参数,并将其传递给父类的构造函数。这个令牌应该是一个有效的授权令牌,用于访问受保护的资源。', '这段代码定义了一个接口`OpenAiApi`,并使用`@Deprecated`注解将其标记为已过时。它还扩展了`com.theokanning.openai.client.OpenAiApi`接口。\n\n`@Deprecated`注解表示该接口已经过时,不推荐使用。开发者应该使用`com.theokanning.openai.client.OpenAiApi`接口代替。\n\n注释中提到这个接口只是为了保持向后兼容性。这意味着它可能是为了与旧版本的代码兼容而保留的,但不推荐在新代码中使用。', '这段代码是一个OkHttp的拦截器用于在请求头中添加授权令牌authorization token\n\n在这个拦截器中首先获取到传入的授权令牌token然后在每个请求的构建过程中使用`newBuilder()`方法创建一个新的请求构建器,并在该构建器中添加一个名为"Authorization"的请求头,值为"Bearer " + token。最后使用该构建器构建一个新的请求并通过`chain.proceed(request)`方法继续处理该请求。\n\n这样当使用OkHttp发送请求时该拦截器会自动在请求头中添加授权令牌以实现身份验证的功能。', '这段代码是一个Java接口用于定义与OpenAI API进行通信的方法。它包含了各种不同类型的请求和响应方法用于与OpenAI API的不同端点进行交互。\n\n接口中的方法包括:\n- `listModels()`:获取可用的模型列表。\n- `getModel(String modelId)`:获取指定模型的详细信息。\n- `createCompletion(CompletionRequest request)`:创建文本生成的请求。\n- `createChatCompletion(ChatCompletionRequest request)`:创建聊天式文本生成的请求。\n- `createEdit(EditRequest request)`:创建文本编辑的请求。\n- `createEmbeddings(EmbeddingRequest request)`:创建文本嵌入的请求。\n- `listFiles()`:获取已上传文件的列表。\n- `uploadFile(RequestBody purpose, MultipartBody.Part file)`:上传文件。\n- `deleteFile(String fileId)`:删除文件。\n- `retrieveFile(String fileId)`:获取文件的详细信息。\n- `retrieveFileContent(String fileId)`:获取文件的内容。\n- `createFineTuningJob(FineTuningJobRequest request)`创建Fine-Tuning任务。\n- `listFineTuningJobs()`获取Fine-Tuning任务的列表。\n- `retrieveFineTuningJob(String fineTuningJobId)`获取指定Fine-Tuning任务的详细信息。\n- `cancelFineTuningJob(String fineTuningJobId)`取消Fine-Tuning任务。\n- `listFineTuningJobEvents(String fineTuningJobId)`获取Fine-Tuning任务的事件列表。\n- `createFineTuneCompletion(CompletionRequest request)`创建Fine-Tuning模型的文本生成请求。\n- `createImage(CreateImageRequest request)`:创建图像生成的请求。\n- `createImageEdit(RequestBody requestBody)`:创建图像编辑的请求。\n- `createImageVariation(RequestBody requestBody)`:创建图像变体的请求。\n- `createTranscription(RequestBody requestBody)`:创建音频转录的请求。\n- `createTranslation(RequestBody requestBody)`:创建音频翻译的请求。\n- `createModeration(ModerationRequest request)`:创建内容审核的请求。\n- `getEngines()`:获取可用的引擎列表。\n- `getEngine(String engineId)`:获取指定引擎的详细信息。\n- `subscription()`:获取账户订阅信息。\n- `billingUsage(LocalDate starDate, LocalDate endDate)`:获取账户消费信息。\n\n这些方法使用不同的HTTP请求类型GET、POST、DELETE和路径来与OpenAI API进行交互并返回相应的响应数据。'])
# res = oae.get_emb(text_list=['''test1"test2test3''', '''test4test5test6'''])
print(res)


@@ -1,14 +0,0 @@
from functools import lru_cache
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from configs.model_config import embedding_model_dict
from loguru import logger
@lru_cache(1)
def load_embeddings(model: str, device: str):
logger.info("load embedding model: {}, {}".format(model, embedding_model_dict[model]))
embeddings = HuggingFaceEmbeddings(model_name=embedding_model_dict[model],
model_kwargs={'device': device})
return embeddings
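
A brief sketch of using the cached loader above; "text2vec-base" is the key used in the HFEmbedding example earlier in this diff and must exist in embedding_model_dict:

emb1 = load_embeddings(model="text2vec-base", device="cpu")
emb2 = load_embeddings(model="text2vec-base", device="cpu")
assert emb1 is emb2  # lru_cache(1) reuses the already-loaded model for identical args

# The returned object is a HuggingFaceEmbeddings instance.
vectors = emb1.embed_documents(["hello world", "你好，世界"])
query_vec = emb1.embed_query("hello world")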


@@ -1,6 +0,0 @@
from .openai_model import getChatModel, getExtraModel
__all__ = [
"getChatModel", "getExtraModel"
]

Some files were not shown because too many files have changed in this diff.