update template asynch #2

Merged
mschoi merged 25 commits from test_change into main 2025-01-27 23:29:08 +09:00
5 changed files with 192 additions and 72 deletions

View File

@@ -1,5 +1,6 @@
"""Code Reviewer for Gitea."""
Review

[REVIEW] Consider adding a comment explaining why asyncio is being imported and how it's used in the script.

[REVIEW] Consider adding a comment explaining why `asyncio` is being imported and how it's used in the script.
Review

[REVIEW] This diff introduces asynchronous programming using asyncio and aiohttp. This is a significant change from the previous synchronous approach using requests. Make sure that the model classes Model are updated to be compatible with async operation. Verify also that the async model works correctly and does not return any unexpected result.

[REVIEW] This diff introduces asynchronous programming using `asyncio` and `aiohttp`. This is a significant change from the previous synchronous approach using `requests`. Make sure that the model classes `Model` are updated to be compatible with async operation. Verify also that the async model works correctly and does not return any unexpected result.
Review

[REVIEW] It's recommended to group standard library imports and third-party imports separately. Consider moving aiohttp import after the standard library imports.

[REVIEW] It's recommended to group standard library imports and third-party imports separately. Consider moving `aiohttp` import after the standard library imports.
Review

[REVIEW] This line imports the aiohttp library, which is used for making asynchronous HTTP requests. Ensure that aiohttp is correctly installed in the environment where this code will run.

[REVIEW] This line imports the `aiohttp` library, which is used for making asynchronous HTTP requests. Ensure that `aiohttp` is correctly installed in the environment where this code will run.
import asyncio
Review

[REVIEW] The function signature has changed to async def, but the docstring hasn't been updated to reflect this change. Consider updating the docstring to mention that this is now an asynchronous function.

[REVIEW] The function signature has changed to `async def`, but the docstring hasn't been updated to reflect this change. Consider updating the docstring to mention that this is now an asynchronous function.
Review

[REVIEW] The analyze_single_chunks function is now asynchronous. Check if the underlying model method get_response_single_chunk is asynchronous. Also, check if the single chunk model works correctly.

[REVIEW] The `analyze_single_chunks` function is now asynchronous. Check if the underlying model method `get_response_single_chunk` is asynchronous. Also, check if the single chunk model works correctly.
Review

[REVIEW] Consider using a more specific import, like from aiohttp import ClientSession.

[REVIEW] Consider using a more specific import, like `from aiohttp import ClientSession`.
Review

[REVIEW] Consider using f-strings for string formatting instead of concatenation. For example, f"old_line_number:{old_idx} {line}"

[REVIEW] Consider using f-strings for string formatting instead of concatenation. For example, `f"old_line_number:{old_idx} {line}"`
Review

[REVIEW] Consider using f-strings directly for formatting instead of string concatenation.,at line 3

[REVIEW] Consider using f-strings directly for formatting instead of string concatenation.,at line 3
Review

[REVIEW] Consider using f-strings for string formatting for better readability and performance, e.g., f"old_line_number:{old_idx} {line}".

[REVIEW] Consider using f-strings for string formatting for better readability and performance, e.g., `f"old_line_number:{old_idx} {line}"`.
Review

[REVIEW] Consider using f-strings for string formatting for better readability.

[REVIEW] Consider using f-strings for string formatting for better readability.
import fnmatch
Review

[REVIEW] Consider using a more descriptive variable name instead of diff to improve code readability.

[REVIEW] Consider using a more descriptive variable name instead of `diff` to improve code readability.
Review

[REVIEW] Consider using f-strings for string formatting instead of concatenation. For example, f"new_line_number:{new_idx} {line}"

[REVIEW] Consider using f-strings for string formatting instead of concatenation. For example, `f"new_line_number:{new_idx} {line}"`
Review

[REVIEW] Consider using f-strings directly for formatting instead of string concatenation.,at line 4

[REVIEW] Consider using f-strings directly for formatting instead of string concatenation.,at line 4
Review

[REVIEW] Consider using f-strings for string formatting for better readability and performance, e.g., f"new_line_number:{new_idx} {line}".

[REVIEW] Consider using f-strings for string formatting for better readability and performance, e.g., `f"new_line_number:{new_idx} {line}"`.
Review

[REVIEW] Consider using f-strings for string formatting for better readability.

[REVIEW] Consider using f-strings for string formatting for better readability.
Review

[REVIEW] Consider using a type hint for asyncio

[REVIEW] Consider using a type hint for asyncio
Review

[REVIEW] Consider adding a comment explaining the purpose of importing asyncio.

[REVIEW] Consider adding a comment explaining the purpose of importing `asyncio`.
import json
Review

[REVIEW] The process_single_chunk function is defined inside analyze_single_chunks. Consider moving it outside to improve readability and reusability.

[REVIEW] The `process_single_chunk` function is defined inside `analyze_single_chunks`. Consider moving it outside to improve readability and reusability.
Review

[REVIEW] It's redundant to add a comma after the string. The comment body should be a single string.,at line 5

[REVIEW] It's redundant to add a comma after the string. The comment body should be a single string.,at line 5
import os
Review

[REVIEW] The get_response_single_chunk method is now being awaited, but it's not clear if this method has been modified to be a coroutine. Ensure that the Model class has been updated accordingly.

[REVIEW] The `get_response_single_chunk` method is now being awaited, but it's not clear if this method has been modified to be a coroutine. Ensure that the `Model` class has been updated accordingly.
Review

[REVIEW] The get_response_single_chunk call is now awaited, which is correct for async. Ensure that the method returns a coroutine.

[REVIEW] The `get_response_single_chunk` call is now awaited, which is correct for async. Ensure that the method returns a coroutine.
Review

[REVIEW] The get_response_single_chunk method should be awaitable, the function definition async def and call await should be used.

[REVIEW] The `get_response_single_chunk` method should be awaitable, the function definition `async def` and call `await` should be used.
@@ -7,6 +8,7 @@ import re
from typing import Any
Review

[REVIEW] The title and description are not used inside the process_single_chunk function; they can be moved into the function scope to avoid capturing the EVENT_DATA.

[REVIEW] The `title` and `description` are not used inside the `process_single_chunk` function; they can be moved into the function scope to avoid capturing the `EVENT_DATA`.
Review

[REVIEW] It is not necessary to await when calling single_chunk_model.get_response_single_chunk if it is a synchronous function. If it is intended to be asynchronous, then the model class should implement asynchronous methods.

[REVIEW] It is not necessary to await when calling `single_chunk_model.get_response_single_chunk` if it is a synchronous function. If it is intended to be asynchronous, then the model class should implement asynchronous methods.
Review

[REVIEW] It is not necessary to strip the character and the 'json' prefix when usingjson.loads`. Also, it is not recommended to have default value '[]' to an empty string. The code should handle the case where the response string is empty or None.

[REVIEW] It is not necessary to strip the ` character and the 'json' prefix when using `json.loads`. Also, it is not recommended to have default value '[]' to an empty string. The code should handle the case where the response string is empty or None.
Review

[REVIEW] Returning a value in the process_single_chunk function early will change the logic.

[REVIEW] Returning a value in the `process_single_chunk` function early will change the logic.
import requests
Review

[REVIEW] Returning an empty list in the process_single_chunk function early will change the logic.

[REVIEW] Returning an empty list in the `process_single_chunk` function early will change the logic.
import aiohttp
Review

[REVIEW] The single chunks are now being processed concurrently using asyncio.gather. This is good for performance, but ensure there is no race condition or unexpected behaviour. In particular, double check that the model call can be done concurrently.

[REVIEW] The single chunks are now being processed concurrently using `asyncio.gather`. This is good for performance, but ensure there is no race condition or unexpected behaviour. In particular, double check that the model call can be done concurrently.
Review

[REVIEW] Consider adding a type hint for response_json for better code readability.

[REVIEW] Consider adding a type hint for `response_json` for better code readability.
from model import Model
Review

[REVIEW] Consider using a type hint for aiohttp

[REVIEW] Consider using a type hint for aiohttp
Review

[REVIEW] Consider adding a comment explaining the purpose of importing aiohttp.

[REVIEW] Consider adding a comment explaining the purpose of importing `aiohttp`.
ACCESS_TOKEN = os.getenv("ACCESS_TOKEN", "")
@@ -57,8 +59,8 @@ def parse_diff(diff: str) -> list[dict[str, Any]]:
r"(?s)diff --git a/(.+?) b/(.*?)\r?\n(.*?)(?=diff --git a/|$)", re.S
)
old_new_pattern = re.compile(r"(?m)^(---|\+\+\+)\s+(.*)$")
hunk_pattern = re.compile(
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@(.*?)(?=^@@ |$)",
chunk_range_pattern = re.compile(
r"@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@(.*?)?(?=@@|\Z)",
re.MULTILINE | re.DOTALL,
Review

[REVIEW] The regex (?=@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@|\Z) could be simplified.

[REVIEW] The regex `(?=@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@|\Z)` could be simplified.
)
list_diff = []
@@ -77,33 +79,31 @@ def parse_diff(diff: str) -> list[dict[str, Any]]:
print("Neglict deleted file")
continue
new_file = new_file.lstrip("b/")
hunk_match = hunk_pattern.search(diff_text)
if hunk_match is None:
continue
old_idx = int(hunk_match.group(1))
new_idx = int(hunk_match.group(3))
remain_text = diff_text[hunk_match.end() + 1 :]
diff_text = []
for line in remain_text.splitlines():
if line.startswith("-"):
diff_text.append(f"{old_idx} {line}")
old_idx += 1
elif line.startswith("+"):
diff_text.append(f"{new_idx} {line}")
new_idx += 1
else:
diff_text.append(line)
diff_text = "\n".join(diff_text)
if any(fnmatch.fnmatch(new_file, pattern) for pattern in EXCLUDE_PATTERNS):
print(f"Exclude file {new_file}")
continue
output_diff_text = []
for chunk_range_match in chunk_range_pattern.finditer(diff_text):

[REVIEW] Consider adding a docstring to explain the purpose of this code block.

[REVIEW] Consider adding a docstring to explain the purpose of this code block.
old_idx = int(chunk_range_match.group(1))

[REVIEW] Consider using a more descriptive variable name than hunk_match.

[REVIEW] Consider using a more descriptive variable name than `hunk_match`.
new_idx = int(chunk_range_match.group(3))
for line in chunk_range_match.group(5).splitlines():
if line.startswith("-"):
Review

[REVIEW] It's better to use hunk_match.group(5).splitlines() directly where it's used, instead of assigning it to remain_text.

[REVIEW] It's better to use `hunk_match.group(5).splitlines()` directly where it's used, instead of assigning it to `remain_text`.
output_diff_text.append(f"{old_idx} None {line}")
old_idx += 1
elif line.startswith("+"):
Review

[REVIEW] It would be clearer to use None as string "None" to avoid confusions.

[REVIEW] It would be clearer to use `None` as string `"None"` to avoid confusions.
output_diff_text.append(f"None {new_idx} {line}")
new_idx += 1
else:
Review

[REVIEW] It would be clearer to use None as string "None" to avoid confusions.

[REVIEW] It would be clearer to use `None` as string `"None"` to avoid confusions.
output_diff_text.append(f"{old_idx} {new_idx} {line}")
old_idx += 1
new_idx += 1
Review

[REVIEW] It might be more efficient to use f-strings directly for formatting, like f'{old_idx} {new_idx} {line}'.

[REVIEW] It might be more efficient to use f-strings directly for formatting, like `f'{old_idx} {new_idx} {line}'`.
output_diff_text = "\n".join(output_diff_text)
list_diff.append(
{
"file": new_file,
"chunk": diff_text,
"chunk": output_diff_text,
}
)
return list_diff
@@ -133,7 +133,7 @@ def create_comment(
return comments
def analyze_single_chunks(
async def analyze_single_chunks(
single_chunk_model: Model, parsed_diff: list[dict[str, Any]]

[REVIEW] This function should be an async function.

[REVIEW] This function should be an async function.
) -> list[dict[str, Any]]:
Review

[REVIEW] The function name analyze_single_chunks should be renamed to reflect the async nature of the function, such as async_analyze_single_chunks.

[REVIEW] The function name `analyze_single_chunks` should be renamed to reflect the async nature of the function, such as `async_analyze_single_chunks`.
"""Analyze single chunks and create comments.
@@ -145,29 +145,33 @@ def analyze_single_chunks(
Returns:
list[dict[str, Any]]: comments for single chunk review
"""
comments = []
title = EVENT_DATA["pull_request"]["title"]
description = EVENT_DATA["pull_request"]["body"]
for diff in parsed_diff:
async def process_single_chunk(diff: dict[str, Any]):
file = diff["file"]
chunk = diff["chunk"]
response = single_chunk_model.get_response_single_chunk(
response = await single_chunk_model.get_response_single_chunk(
file, title, description, chunk
)
Review

[REVIEW] Consider using a more descriptive variable name than response.

[REVIEW] Consider using a more descriptive variable name than `response`.
response = response.strip("`").lstrip("json").strip() or "[]"
Review

[REVIEW] This line does not need to await.

[REVIEW] This line does not need to await.
try:
response_json = json.loads(response)
new_comments = create_comment(file, response_json)
comments.extend(new_comments)
return create_comment(file, response_json)
except json.JSONDecodeError:
print(f"Failed to parse response: {response}")
continue
return []
title = EVENT_DATA["pull_request"]["title"]
description = EVENT_DATA["pull_request"]["body"]
tasks = [process_single_chunk(diff) for diff in parsed_diff]
results = await asyncio.gather(*tasks)
Review

[REVIEW] It might be beneficial to add error handling around the asyncio.gather call in case any of the tasks fail.

[REVIEW] It might be beneficial to add error handling around the `asyncio.gather` call in case any of the tasks fail.
# Flatten the list of comments
comments = [comment for result in results for comment in result]
return comments

[REVIEW] This function should be an async function.

[REVIEW] This function should be an async function.
def get_file_content(file: str) -> str | None:
async def get_file_content(file: str) -> str | None:
"""Get file content from Gitea.
Review

[REVIEW] The function name get_file_content should be renamed to reflect the async nature of the function, such as async_get_file_content.

[REVIEW] The function name `get_file_content` should be renamed to reflect the async nature of the function, such as `async_get_file_content`.
Review

[REVIEW] Consider using a more specific exception handling for the aiohttp.ClientError.

[REVIEW] Consider using a more specific exception handling for the `aiohttp.ClientError`.
Args:
@@ -183,15 +187,18 @@ def get_file_content(file: str) -> str | None:
url = f"{repo_url}/raw/{branch}%2F{replaced_file}?ref={branch}"
try:
response = requests.get(url, headers=HEADERS)
response.raise_for_status()
return response.text
except requests.RequestException as e:
print(f"Failed to get file content: {e}")
return None
async with aiohttp.ClientSession(headers=HEADERS) as session:
async with session.get(url) as response:
response.raise_for_status()
Review

[REVIEW] The aiohttp.ClientSession should be created outside the function or passed as an argument for better performance.

[REVIEW] The `aiohttp.ClientSession` should be created outside the function or passed as an argument for better performance.
return await response.text()
except aiohttp.ClientError as e: # More specific exception handling
print(f"Network error fetching {file}: {e}")
except asyncio.TimeoutError:
Review

[REVIEW] Handle more specific exception aiohttp.ClientError instead of the generic exception.

[REVIEW] Handle more specific exception `aiohttp.ClientError` instead of the generic exception.
print(f"Timeout fetching {file}")
return None
def analyze_full_context(
async def analyze_full_context(
full_context_model: Model, parsed_diff: list[dict[str, Any]]
) -> str:
Review

[REVIEW] The function name analyze_full_context should be renamed to reflect the async nature of the function, such as async_analyze_full_context.

[REVIEW] The function name `analyze_full_context` should be renamed to reflect the async nature of the function, such as `async_analyze_full_context`.
Review

[REVIEW] This function should be an async function.

[REVIEW] This function should be an async function.
"""Analyze full context and create review.
@@ -203,22 +210,26 @@ def analyze_full_context(
Returns:
str: review for full context
"""
file_contents = []
for diff in parsed_diff:
async def get_file_data(diff: dict[str, Any]):
file = diff["file"]
chunk = diff["chunk"]
content = get_file_content(file)
if content is None:
continue
file_contents.append(f"File: {file}")
file_contents.append(content)
file_contents.append(f"Diff: {chunk}")
return None

[REVIEW] This line does not need to await.

[REVIEW] This line does not need to await.
return f"File: {file}\n{content}\nDiff: {chunk}"
tasks = [get_file_data(diff) for diff in parsed_diff]
file_contents_list = await asyncio.gather(*tasks)
Review

[REVIEW] It might be beneficial to add error handling around the asyncio.gather call in case any of the tasks fail.

[REVIEW] It might be beneficial to add error handling around the `asyncio.gather` call in case any of the tasks fail.
file_contents = [item for item in file_contents_list if item is not None]
if not file_contents:
return ""
title = EVENT_DATA["pull_request"]["title"]
description = EVENT_DATA["pull_request"]["body"]
response = full_context_model.get_response_full_context(
response = await full_context_model.get_response_full_context(
title, description, file_contents
)
Review

[REVIEW] This line does not need to await.

[REVIEW] This line does not need to await.
response = response.strip("`").lstrip("markdown").strip()
@@ -248,10 +259,10 @@ def post_review(
response.raise_for_status()
def main() -> None:
"""Code Reviewer for Gitea."""
async def main() -> None:
"""Code Reviewer for Gitea: Asynchronous version."""
if EVENT_DATA["action"] not in ["opened", "synchronized"]:

[REVIEW] Consider adding a more descriptive name like async_main instead of just main to better reflect its asynchronous nature.

[REVIEW] Consider adding a more descriptive name like `async_main` instead of just `main` to better reflect its asynchronous nature.

[REVIEW] Consider adding a more specific description for asynchronous main function.

[REVIEW] Consider adding a more specific description for asynchronous main function.
print("Unsupproted event.")
print("Unsupported event.")
return

[REVIEW] Typo: Unsupproted should be Unsupported.

[REVIEW] Typo: `Unsupproted` should be `Unsupported`.
diff = get_diff()
@@ -273,10 +284,21 @@ def main() -> None:
)
parsed_diff = parse_diff(diff)
comments = analyze_single_chunks(single_chunk_model, parsed_diff)
full_context_response = analyze_full_context(full_context_model, parsed_diff)
comments_task = asyncio.create_task(
analyze_single_chunks(single_chunk_model, parsed_diff)

[REVIEW] Consider removing the print statements before production.

[REVIEW] Consider removing the print statements before production.
)
if EVENT_DATA["action"] == "opened":
full_context_response_task = asyncio.create_task(
analyze_full_context(full_context_model, parsed_diff)
)
full_context_response = await full_context_response_task
else:
full_context_response = ""
comments = await comments_task
post_review(full_context_response, comments)
if __name__ == "__main__":

[REVIEW] Consider a more explicit way to call the async main function.

[REVIEW] Consider a more explicit way to call the async main function.
main()
asyncio.run(main())

View File

@@ -4,8 +4,16 @@ from enum import Enum
from typing import Any
Review

[REVIEW] The import statements for anthropic and openai have been updated to use their async versions. This is consistent with the pull request title to update the template to be async.

[REVIEW] The import statements for anthropic and openai have been updated to use their async versions. This is consistent with the pull request title to update the template to be async.
Review

[REVIEW] Consider renaming typing_extensions to typing to avoid confusion since it's used as typing.

[REVIEW] Consider renaming `typing_extensions` to `typing` to avoid confusion since it's used as `typing`.
import google.generativeai as genai
Review

[REVIEW] A new import for typing_extensions has been added and aliased as typing. This is likely for type hinting the Google response and related models.

[REVIEW] A new import for typing_extensions has been added and aliased as typing. This is likely for type hinting the Google response and related models.
from anthropic import Anthropic
from openai import OpenAI
import typing_extensions as typing
from anthropic import AsyncAnthropic
from openai import AsyncOpenAI

[REVIEW] The request method has been changed to be asynchronous, but there's no corresponding change in the method signature to use async def. Consider updating it to async def request(self, prompt: str) -> str:.

[REVIEW] The `request` method has been changed to be asynchronous, but there's no corresponding change in the method signature to use `async def`. Consider updating it to `async def request(self, prompt: str) -> str:`.

[REVIEW] A new type definition for GoogleReponse has been added as a TypedDict with lineNumber as int and reviewComment as string.

[REVIEW] A new type definition for GoogleReponse has been added as a TypedDict with `lineNumber` as int and `reviewComment` as string.
Review

[REVIEW] Consider renaming GoogleReponse to GoogleResponse to adhere to standard naming conventions.,at line 9

[REVIEW] Consider renaming `GoogleReponse` to `GoogleResponse` to adhere to standard naming conventions.,at line 9
Review

[REVIEW] It's recommended to use typing.TypedDict instead of typing_extensions.TypedDict if you're targeting Python 3.8+.

[REVIEW] It's recommended to use `typing.TypedDict` instead of `typing_extensions.TypedDict` if you're targeting Python 3.8+.
Review

[REVIEW] Consider renaming GoogleReponse to GoogleResponse for better readability and consistency.

[REVIEW] Consider renaming `GoogleReponse` to `GoogleResponse` for better readability and consistency.
class GoogleReponse(typing.TypedDict):
"""The response from Google model."""

[REVIEW] The get_response_single_chunk method has been changed to be asynchronous, but there's no await keyword used when calling self.request(prompt). Consider updating it to return await self.request(prompt).

[REVIEW] The `get_response_single_chunk` method has been changed to be asynchronous, but there's no `await` keyword used when calling `self.request(prompt)`. Consider updating it to `return await self.request(prompt)`.
Review

[REVIEW] Consider renaming GoogleReponse to GoogleResponse for consistency.

[REVIEW] Consider renaming `GoogleReponse` to `GoogleResponse` for consistency.
Review

[REVIEW] Consider using a more descriptive name than GoogleReponse.

[REVIEW] Consider using a more descriptive name than `GoogleReponse`.
Review

[REVIEW] Consider renaming GoogleReponse to GoogleResponse.

[REVIEW] Consider renaming `GoogleReponse` to `GoogleResponse`.
Review

[REVIEW] The OpenAI client is now initialized with the async version AsyncOpenAI.

[REVIEW] The OpenAI client is now initialized with the async version AsyncOpenAI.
lineNumber: int

[REVIEW] The get_response_full_context method has been changed to be asynchronous, but there's no await keyword used when calling self.request(prompt). Consider updating it to return await self.request(prompt).

[REVIEW] The `get_response_full_context` method has been changed to be asynchronous, but there's no `await` keyword used when calling `self.request(prompt)`. Consider updating it to `return await self.request(prompt)`.
Review

[REVIEW] The Anthropic client is now initialized with the async version AsyncAnthropic.

[REVIEW] The Anthropic client is now initialized with the async version AsyncAnthropic.
reviewComment: str
Review

[REVIEW] The google generative model is now initilized without api key argument.

[REVIEW] The google generative model is now initilized without api key argument.
Review

[REVIEW] The api_key is redundant here since it is already configured in genai.configure.

[REVIEW] The `api_key` is redundant here since it is already configured in `genai.configure`.
Review

[REVIEW] Consider removing the api_key argument here since it is already configured in genai.configure(api_key=api_key)

[REVIEW] Consider removing the `api_key` argument here since it is already configured in `genai.configure(api_key=api_key)`
Review

[REVIEW] The api_key argument is redundant here since it was already used in the genai.configure call.

[REVIEW] The `api_key` argument is redundant here since it was already used in the `genai.configure` call.
Review

[REVIEW] The type annotation for lineNumber should be int, not str.

[REVIEW] The type annotation for `lineNumber` should be `int`, not `str`.
Review

[REVIEW] Consider using typing.NotRequired to indicate that lineNumber is optional since the response may not always contain it.

[REVIEW] Consider using `typing.NotRequired` to indicate that `lineNumber` is optional since the response may not always contain it.
Review

[REVIEW] It's unusual to have a field named lineNumber in a type definition that describes an API response. It might be more appropriate to use a different name or rethink this structure.

[REVIEW] It's unusual to have a field named `lineNumber` in a type definition that describes an API response. It might be more appropriate to use a different name or rethink this structure.
Review

[REVIEW] The Deepseek client is now initialized with the async version AsyncOpenAI.

[REVIEW] The Deepseek client is now initialized with the async version AsyncOpenAI.
Review

[REVIEW] The system_instruction parameter is not a valid parameter for genai.GenerativeModel. It should be passed to the generate_content method.

[REVIEW] The `system_instruction` parameter is not a valid parameter for `genai.GenerativeModel`. It should be passed to the `generate_content` method.
Review

[REVIEW] The type annotation for reviewComment should be str, not int.

[REVIEW] The type annotation for `reviewComment` should be `str`, not `int`.
Review

[REVIEW] Consider using typing.NotRequired to indicate that reviewComment is optional since the response may not always contain it.

[REVIEW] Consider using `typing.NotRequired` to indicate that `reviewComment` is optional since the response may not always contain it.
Review

[REVIEW] It's unusual to have a field named reviewComment in a type definition that describes an API response. It might be more appropriate to use a different name or rethink this structure.

[REVIEW] It's unusual to have a field named `reviewComment` in a type definition that describes an API response. It might be more appropriate to use a different name or rethink this structure.
Review

[REVIEW] The request method is now async.

[REVIEW] The request method is now async.
class ModelProvider(Enum):
Review

[REVIEW] The openai chat completion is now called with await.

[REVIEW] The openai chat completion is now called with await.
@@ -79,16 +87,18 @@ class Model:
"""
match self.provider:
Review

[REVIEW] The api_key parameter is redundant, as the key is configured globally.

[REVIEW] The `api_key` parameter is redundant, as the key is configured globally.
case ModelProvider.OPENAI:
return OpenAI(api_key=api_key)
return AsyncOpenAI(api_key=api_key)
Review

[REVIEW] Consider adding a comment explaining the change from synchronous to asynchronous client initialization.

[REVIEW] Consider adding a comment explaining the change from synchronous to asynchronous client initialization.
case ModelProvider.ANTHROPIC:
return Anthropic(api_key=api_key)
return AsyncAnthropic(api_key=api_key)
Review

[REVIEW] Since the request method is now asynchronous, its name should also reflect this change (e.g., async_request).

[REVIEW] Since the `request` method is now asynchronous, its name should also reflect this change (e.g., `async_request`).
case ModelProvider.GOOGLE:
genai.configure(api_key=api_key)
return genai.GenerativeModel(model=self.model, api_key=api_key)
return genai.GenerativeModel(
model_name=self.model, system_instruction=self.system_prompt
)
Review

[REVIEW] The api_key parameter is not needed here since it is already configured globally.

[REVIEW] The `api_key` parameter is not needed here since it is already configured globally.
Review

[REVIEW] The api_key is already configured in genai.configure(api_key=api_key), so it should not be passed here.

[REVIEW] The `api_key` is already configured in `genai.configure(api_key=api_key)`, so it should not be passed here.
Review

[REVIEW] The api_key parameter is not required, remove it from function call.

[REVIEW] The `api_key` parameter is not required, remove it from function call.
case ModelProvider.DEEPSEEK:
return OpenAI(api_key=api_key, base_url="https://api.deepseek.com")
return AsyncOpenAI(api_key=api_key, base_url="https://api.deepseek.com")
def request(self, prompt: str) -> str:
async def request(self, prompt: str) -> str:
"""Request the model to generate a response.
Review

[REVIEW] Consider adding a comment explaining the change from synchronous to asynchronous call.

[REVIEW] Consider adding a comment explaining the change from synchronous to asynchronous call.
Args:
@@ -99,7 +109,7 @@ class Model:
"""
match self.provider:
case ModelProvider.OPENAI | ModelProvider.DEEPSEEK:
response = self.session.chat.completions.create(
response = await self.session.chat.completions.create(
model=self.model,
messages=[
{"role": "system", "content": self.system_prompt},
@@ -113,7 +123,7 @@ class Model:
)
return response.choices[0].message.content.strip()
case ModelProvider.ANTHROPIC:
response = self.session.messages.create(
response = await self.session.messages.create(
model=self.model,
messages=[{"role": "user", "content": prompt}],
system=[
@@ -128,10 +138,16 @@ class Model:
)
return response.content[0].text.strip()
case ModelProvider.GOOGLE:
response = self.session.generate_content(prompt)
response = await self.session.generate_content_async(
prompt,
Review

[REVIEW] It is recommended to set response_mime_type and response_schema outside the function so that the arguments can be customized.

[REVIEW] It is recommended to set `response_mime_type` and `response_schema` outside the function so that the arguments can be customized.
generation_config=genai.GenerationConfig(
response_mime_type="application/json",
response_schema=list[GoogleReponse],
Review

[REVIEW] The response_mime_type is not necessary since the response type is handled by response_schema.

[REVIEW] The `response_mime_type` is not necessary since the response type is handled by `response_schema`.
),
Review

[REVIEW] The response_schema should be GoogleReponse instead of list[GoogleReponse] since the model is not expected to return a list.

[REVIEW] The `response_schema` should be `GoogleReponse` instead of `list[GoogleReponse]` since the model is not expected to return a list.
Review

[REVIEW] The response_schema should be a type not a list of type.

[REVIEW] The `response_schema` should be a type not a list of type.
Review

[REVIEW] It is recommended to set response_mime_type and response_schema outside the function so that the arguments can be customized.

[REVIEW] It is recommended to set `response_mime_type` and `response_schema` outside the function so that the arguments can be customized.
)
return response.text.strip()
def get_response_single_chunk(
async def get_response_single_chunk(
Review

[REVIEW] Consider adding a comment explaining the change from synchronous to asynchronous call.

[REVIEW] Consider adding a comment explaining the change from synchronous to asynchronous call.
self, file: str, title: str, description: str, chunk: str
) -> str:
Review

[REVIEW] Since the get_response_full_context method is now asynchronous, its name should also reflect this change (e.g., async_get_response_full_context).

[REVIEW] Since the `get_response_full_context` method is now asynchronous, its name should also reflect this change (e.g., `async_get_response_full_context`).
"""Get the response for a single chunk.
@@ -146,9 +162,9 @@ class Model:
str: The response.
"""
prompt = SINGLE_CHUNK_USER_PROMPT.format(file, title, description, chunk)
return self.request(prompt)
return await self.request(prompt)
def get_response_full_context(
async def get_response_full_context(
self, title: str, description: str, file_contents: list[str]
) -> str:
Review

[REVIEW] Consider adding a comment explaining the change from synchronous to asynchronous call.

[REVIEW] Consider adding a comment explaining the change from synchronous to asynchronous call.
"""Get the response for full context.
@@ -165,7 +181,7 @@ class Model:
prompt = FULL_CONTEXT_USER_PROMPT.format(
title, description, "\n".join(file_contents)
)
return self.request(prompt)
return await self.request(prompt)
except Exception as e:
print(f"Error during full context response: {e}")
print(prompt)
@@ -175,14 +191,21 @@ class Model:
SINGLE_CHUNK_SYSTEM_PROMPT = (
"Your task is to review pull requests. Instructions:\n"
"- Provide the response in the following JSON format: "
"""[{{"lineNumber": <line_number>, "reviewComment": "<review comment>"}}] \n"""
"""[{{"lineNumber": int, "reviewComment": str}}] \n"""
Review

[REVIEW] The json format should include double quotes for keys and values. For example, {"lineNumber": int, "reviewComment": str}.

[REVIEW] The json format should include double quotes for keys and values. For example, `{"lineNumber": int, "reviewComment": str}`.
"- lineNumber is about the line number of the code that in new file. \n"
Review

[REVIEW] Consider to explicitly specify the format for the numbers instead of using int and str

[REVIEW] Consider to explicitly specify the format for the numbers instead of using `int` and `str`
Review

[REVIEW] The format of JSON should be clarified, like [{"lineNumber": int, "reviewComment": str}]

[REVIEW] The format of JSON should be clarified, like `[{"lineNumber": int, "reviewComment": str}]`
"- lineNumber can be found at the front of each line. \n"
"- At the first number is old line number, the second number is new line number. \n"
Review

[REVIEW] Consider adding a blank line after this line for better readability

[REVIEW] Consider adding a blank line after this line for better readability
"- If the line starts with `+`, it means the line is added. \n"
"- If the line starts with `-`, it means the line is deleted. \n"
"- Evaluate whether the code changes and additions are appropriate "
"and if the new code structure is suitable. \n"
"- Do not give positive comments or compliments. \n"
"- Provide comments and suggestions ONLY if there is something to improve"
"otherwise return an empty array. \n"
"- Write the comment in GitHub Markdown format. \n"
"- Use the given description only for the overall context "
"and only comment the code. \n"
"- Do not suggest type hint or naming convention. \n"
"- IMPORTANT: NEVER suggest adding comments to the code. \n"
)
SINGLE_CHUNK_USER_PROMPT = (

View File

@@ -21,15 +21,15 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install requests py-gitea openai anthropic google-generativeai
pip install aiohttp requests py-gitea openai anthropic google-generativeai
- name: Run Code Review
env:
ACCESS_TOKEN: ${{ secrets.ACCESS_TOKEN }}
FULL_CONTEXT_MODEL: gpt-4o
FULL_CONTEXT_API_KEY: ${{ secrets.OPENAI_API_KEY }}
SINGLE_CHUNK_MODEL: gpt-4o
SINGLE_CHUNK_API_KEY: ${{ secrets.OPENAI_API_KEY }}
FULL_CONTEXT_MODEL: deepseek-reasoner
FULL_CONTEXT_API_KEY: ${{ secrets.DEEPSEEK_API_KEY }}
SINGLE_CHUNK_MODEL: gemini-2.0-flash-exp
SINGLE_CHUNK_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
EXCLUDE: "*.yml,*.yaml"
run: python .gitea/scripts/code_review.py

View File

@@ -0,0 +1,30 @@
name: mint_ci
on:
push:
branches: [ "main"]
pull_request:
types: [unlabeled, opened, synchronize, reopened]
env:
PYTHON_VERSION: "3.12.3"
jobs:
lint:
name: Check Python code using ruff
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
token: ${{ secrets.ACCESS_TOKEN }}
- uses: astral-sh/ruff-action@v1
with:
version: 0.7.4
args: check . --select=E5,F4,F8,D400,D403,D417,D100,D102,D103,D101,ANN001,ANN201 --output-format=full --exclude='**/test_*.py','**/__init__.py' --force-exclude
changed-files: 'true'
- uses: astral-sh/ruff-action@v1
with:
version: 0.7.4
args: check . --select=I --output-format=full --force-exclude
changed-files: 'true'
continue-on-error: true

View File

@@ -0,0 +1,45 @@
on:
push:
branches: [main]
pull_request:
types: [unlabeled, opened, synchronize, reopened]
merge_group:
name: Rust-lint
jobs:
rust_tests:
env:
RUST_BACKTRACE: full
name: Run rust tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: clippy
- uses: Swatinem/rust-cache@v2
- name: run clippy
run: cargo clippy --workspace --exclude rustpython_wasm -- -Dwarnings
- name: run rust tests
run: cargo test --workspace --verbose
- name: check compilation without threading
run: cargo check
lint:
name: Check Rust code with rustfmt and clippy
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: rustfmt, clippy
- name: run rustfmt
run: cargo fmt --check