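🚀 TokenScope
TokenScope is a tool for token-aware analysis of codebases intended for use with large language models (LLMs). It combines FastMCP and tiktoken so that directory structures and file contents can be passed to an LLM as context without exceeding its token limits.
🚀 Quick Start
A simple usage example: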
from tokenscope import create_client
client = create_client()
structure = client.scan_directory_structure("path/to/directory")
content = client.extract_file_content("path/to/file.py")
analysis = client.analyze_token_usage("path/to/directory")
print(structure)
print(content)
print(analysis)
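✨ Key Features
- Directory structure scanning: scans a directory and returns its structure in a token-efficient form.
- File content extraction: extracts the content of a specific file, taking token limits and formatting into account.
- File search: searches a directory tree for files that match the given patterns.
- Token usage analysis: analyzes the token usage of a directory or file to estimate what an LLM would need to process it.
- Report generation: generates a comprehensive Markdown report with directory statistics.
- File copying: copies a file from a source path to a destination path.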
📦 Installation
Install via pip
pip install token-scope
Clone the repository and install
git clone https://github.com/yourusername/token-scope.git
cd token-scope
python -m pip install .
📚 Detailed Documentation
scan_directory_structure
Scans a directory and returns its structure while respecting the token limit.
scan_directory_structure(
path: str,
depth: int = 3,
max_tokens: int = 10000,
ignore_patterns: list[str] | None = None,
include_gitignore: bool = True,
include_default_ignores: bool = True
)
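To illustrate how the depth and max_tokens parameters could interact, here is a minimal standard-library sketch. It is not TokenScope's implementation: scan_tree and estimate_tokens are hypothetical names, and the four-characters-per-token estimate is an assumed stand-in for a real tokenizer such as tiktoken.

```python
import os


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. This is only a stand-in
    # for a real tokenizer and will not match tiktoken's counts.
    return max(1, len(text) // 4)


def scan_tree(path: str, depth: int = 3, max_tokens: int = 10000) -> list[str]:
    """Return indented tree lines, limited by depth and a token budget."""
    lines: list[str] = []
    budget = max_tokens

    def walk(current: str, level: int) -> bool:
        nonlocal budget
        for name in sorted(os.listdir(current)):
            full = os.path.join(current, name)
            is_dir = os.path.isdir(full)
            line = "  " * level + name + ("/" if is_dir else "")
            cost = estimate_tokens(line)
            if cost > budget:
                # Budget exhausted: leave a marker and stop the whole scan.
                lines.append("  " * level + "... (truncated)")
                return False
            budget -= cost
            lines.append(line)
            # depth=1 lists only top-level entries, depth=2 adds one level, etc.
            if is_dir and level + 1 < depth:
                if not walk(full, level + 1):
                    return False
        return True

    walk(path, 0)
    return lines


if __name__ == "__main__":
    print("\n".join(scan_tree(".", depth=2, max_tokens=200)))
```

Stopping at the first entry that does not fit keeps the output predictable; the truncation marker tells the reader (or the LLM) that the listing is incomplete.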
extract_file_content
Extracts the content of a specific file, taking token limits and formatting into account.
extract_file_content(
file_path: str,
max_tokens: int = 10000,
sample_only: bool = False
)
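The idea of capping extracted content at max_tokens can be sketched as follows. This is an illustration under stated assumptions, not the library's code: truncate_to_budget is a hypothetical helper, and it counts whitespace-separated words as tokens, whereas TokenScope is described as using tiktoken.

```python
import re


def truncate_to_budget(text: str, max_tokens: int = 10000) -> tuple[str, bool]:
    """Keep at most max_tokens tokens; return (text, was_truncated)."""
    # Stand-in tokenizer (assumption): each run of non-whitespace is one token.
    matches = list(re.finditer(r"\S+", text))
    if len(matches) <= max_tokens:
        return text, False
    if max_tokens <= 0:
        return "", True
    # Cut right after the last token that still fits in the budget.
    end = matches[max_tokens - 1].end()
    return text[:end], True


if __name__ == "__main__":
    snippet, cut = truncate_to_budget("def f(x):\n    return x\n", max_tokens=2)
    print(repr(snippet), cut)
```

Returning a flag alongside the text lets the caller state explicitly that the content was shortened.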
search_files_by_pattern
Searches a directory for files that match the given patterns.
search_files_by_pattern(
directory: str,
patterns: list[str],
max_depth: int = 5,
include_content: bool = False,
max_files: int = 100,
max_tokens_per_file: int = 1000,
sample_only: bool = False,
ignore_patterns: list[str] | None = None,
include_gitignore: bool = True,
include_default_ignores: bool = True
)
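A pattern search bounded by max_depth and max_files might look like the sketch below. It assumes glob-style patterns matched against file names with the standard fnmatch module; whether TokenScope matches names or full paths is not stated here, and find_files is a hypothetical name.

```python
import fnmatch
import os


def find_files(directory: str, patterns: list[str],
               max_depth: int = 5, max_files: int = 100) -> list[str]:
    """Return paths (relative to directory) whose file name matches a pattern."""
    matches: list[str] = []
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # deterministic traversal order
        rel = os.path.relpath(root, directory)
        level = 0 if rel == "." else rel.count(os.sep) + 1
        if level + 1 >= max_depth:
            dirs[:] = []  # max_depth=1 means: files directly in directory only
        for name in sorted(files):
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                matches.append(os.path.relpath(os.path.join(root, name), directory))
                if len(matches) >= max_files:
                    return matches  # stop early once the cap is reached
    return matches


if __name__ == "__main__":
    print(find_files(".", ["*.py", "*.md"], max_depth=2, max_files=10))
```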
analyze_token_usage
Analyzes the token usage of the given path.
analyze_token_usage(
path: str,
include_file_details: bool = False,
ignore_patterns: list[str] | None = None,
include_gitignore: bool = True,
include_default_ignores: bool = True
)
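As a rough picture of what such an analysis involves, the sketch below totals token counts over a file or directory and optionally keeps per-file details. The names summarize_tokens and count_tokens and the shape of the returned dictionary are assumptions made for illustration, and the whitespace-based count is a stand-in for tiktoken.

```python
import os


def count_tokens(text: str) -> int:
    # Stand-in tokenizer (assumption): whitespace-separated words.
    return len(text.split())


def summarize_tokens(path: str, include_file_details: bool = False) -> dict:
    """Total the token counts of one file or of every readable file under path."""
    if os.path.isfile(path):
        targets = [path]
    else:
        targets = sorted(
            os.path.join(root, name)
            for root, _, names in os.walk(path)
            for name in names
        )
    per_file: dict[str, int] = {}
    for file_path in targets:
        try:
            with open(file_path, encoding="utf-8") as handle:
                per_file[file_path] = count_tokens(handle.read())
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
    summary = {"file_count": len(per_file), "total_tokens": sum(per_file.values())}
    if include_file_details:
        summary["files"] = per_file
    return summary
```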
generate_directory_report
Generates a Markdown report containing directory statistics.
generate_directory_report(
directory: str,
depth: int = 3,
include_file_content: bool = True,
max_files_with_content: int = 5,
max_tokens_per_file: int = 1000,
sample_only: bool = False,
ignore_patterns: list[str] | None = None,
include_gitignore: bool = True,
include_default_ignores: bool = True
)
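The report layout is not specified here, so the following is only one plausible shape for a Markdown statistics report. render_report is a hypothetical function that takes precomputed per-file token counts and produces summary lines plus a table sorted by token count.

```python
def render_report(directory: str, token_counts: dict[str, int]) -> str:
    """Render per-file token counts as a small Markdown report."""
    total = sum(token_counts.values())
    lines = [
        f"# Directory report: {directory}",
        "",
        f"- Files: {len(token_counts)}",
        f"- Total tokens: {total}",
        "",
        "| File | Tokens |",
        "| --- | ---: |",
    ]
    # Largest files first; ties are broken alphabetically.
    for name, tokens in sorted(token_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append(f"| {name} | {tokens} |")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_report("src", {"a.py": 10, "b.py": 30}))
```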
copy_file_to_destination
Copies a file to the specified destination path.
copy_file_to_destination(
source_path: str,
destination_path: str
)
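For comparison, a file copy with the same two arguments can be written with the standard shutil module. This is a sketch, not TokenScope's code: copy_file is a hypothetical name, and creating missing parent directories is an assumed convenience that the library may or may not provide.

```python
import os
import shutil


def copy_file(source_path: str, destination_path: str) -> str:
    """Copy a file, creating the destination's parent directories if needed."""
    if not os.path.isfile(source_path):
        raise FileNotFoundError(source_path)
    parent = os.path.dirname(os.path.abspath(destination_path))
    os.makedirs(parent, exist_ok=True)
    # shutil.copy2 also preserves metadata such as modification time
    # and returns the path of the newly created file.
    return shutil.copy2(source_path, destination_path)
```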
🔧 Technical Details
TokenScope automatically ignores the following common directories and files:
DEFAULT_IGNORE_PATTERNS = [
".git/",
".venv/",
"venv/",
"__pycache__/",
"node_modules/",
".pytest_cache/",
".ipynb_checkpoints/",
".DS_Store",
"*.pyc",
"*.pyo",
"*.pyd",
"*.so",
"*.dll",
"*.class",
"build/",
"dist/",
"*.egg-info/",
".tox/",
".coverage",
".idea/",
".vscode/",
".mypy_cache/",
]
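One way to read this list is that entries ending in "/" name directories and the rest are file-name globs. The sketch below applies that reading with the standard fnmatch module. It is a simplification of gitignore-style matching, not TokenScope's actual matcher, and is_ignored is a hypothetical helper.

```python
import fnmatch


def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    """Check a relative path against directory ("name/") and file-name patterns."""
    normalized = rel_path.replace("\\", "/")
    is_dir = normalized.endswith("/")
    parts = [part for part in normalized.split("/") if part]
    # Directory components: everything except the final file name.
    dir_parts = parts if is_dir else parts[:-1]
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: matches if any directory component matches.
            if any(fnmatch.fnmatchcase(part, pattern[:-1]) for part in dir_parts):
                return True
        elif parts and fnmatch.fnmatchcase(parts[-1], pattern):
            # Plain pattern: compared against the last path component only.
            return True
    return False


if __name__ == "__main__":
    sample = [".git/", "__pycache__/", "*.pyc", "*.egg-info/", ".DS_Store"]
    for candidate in ("src/app.py", "src/__pycache__/app.cpython-311.pyc", ".git/"):
        print(candidate, is_ignored(candidate, sample))
```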
📄 License
This project is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments