DroidCon 2024 - AI Pull Request reviewer using ChatGPT and GitHub Actions

inspirit941 2024. 3. 21. 23:54

https://www.droidcon.com/2024/03/14/automate-pull-request-reviews-using-chatgpt-and-github-actions/?ref=dailydev

https://youtu.be/t9hleFcIWQ8?si=eWwzMBgHdcRAd5FG

I came across this video while browsing the web and it looked interesting, so I wrote up a summary.

The demo introduces rookie-level mistakes into an Android codebase and then has them code-reviewed.

Repository: https://github.com/Nerdy-Things/chat-gpt-pr-reviewer

The video and the repo already cover the comparison of review results, so I'll skip that and look at how the GitHub Actions workflow is put together.

# Apache License
# Version 2.0, January 2004
# Author: Eugene Tkachenko

name: Pull Request ChatGPT review

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  ai_pr_reviewer:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'

      - name: Install dependencies
        run: pip install -r .ai/io/nerdythings/requirements.txt
        # requirements.txt lists just two packages: requests and openai.

      - name: Run Reviewer Script
        env:
          GITHUB_HEAD_REF: ${{ github.head_ref }}
          GITHUB_BASE_REF: ${{ github.base_ref }}
          CHATGPT_KEY: ${{ secrets.CHATGPT_KEY }}
          CHATGPT_MODEL: ${{ secrets.CHATGPT_MODEL }}
          GITHUB_TOKEN: ${{ secrets.API_KEY }}
          TARGET_EXTENSIONS: ${{ vars.TARGET_EXTENSIONS }}
          REPO_OWNER: ${{ github.repository_owner }}
          REPO_NAME: ${{ github.event.repository.name }}
          PULL_NUMBER: ${{ github.event.number }}
        run: |
          python .ai/io/nerdythings/github_reviewer.py

      - name: Upload result as an artifact
        uses: actions/upload-artifact@v4
        with:
          name: AI-requests
          path: output.txt
          retention-days: 1

There's less to it than I expected: it sets up the GitHub Action, runs a Python script, and saves the output as an artifact.
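
The `env:` block is the only interface between the workflow and the script, so it helps to picture what the receiving side might look like. Below is a minimal sketch, not the repo's actual `EnvVars` class: `ReviewerConfig` and `load_config` are hypothetical names, and treating `TARGET_EXTENSIONS` as a comma-separated list (e.g. `kt,java`) is my assumption.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping

# Names the workflow exports in its `env:` block.
REQUIRED_VARS = [
    "GITHUB_HEAD_REF", "GITHUB_BASE_REF", "CHATGPT_KEY", "CHATGPT_MODEL",
    "GITHUB_TOKEN", "TARGET_EXTENSIONS", "REPO_OWNER", "REPO_NAME", "PULL_NUMBER",
]


@dataclass
class ReviewerConfig:  # hypothetical stand-in for the repo's EnvVars
    head_ref: str
    base_ref: str
    chat_gpt_token: str
    chat_gpt_model: str
    token: str
    target_extensions: List[str]
    owner: str
    repo: str
    pull_number: int


def load_config(env: Mapping[str, str] = os.environ) -> ReviewerConfig:
    # Fail fast with a readable message instead of a KeyError mid-review.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))

    # Assumption: extensions arrive as a comma-separated string such as "kt,java".
    extensions = [
        ext.strip().lstrip(".")
        for ext in env["TARGET_EXTENSIONS"].split(",")
        if ext.strip()
    ]
    return ReviewerConfig(
        head_ref=env["GITHUB_HEAD_REF"],
        base_ref=env["GITHUB_BASE_REF"],
        chat_gpt_token=env["CHATGPT_KEY"],
        chat_gpt_model=env["CHATGPT_MODEL"],
        token=env["GITHUB_TOKEN"],
        target_extensions=extensions,
        owner=env["REPO_OWNER"],
        repo=env["REPO_NAME"],
        pull_number=int(env["PULL_NUMBER"]),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to try the parsing locally without touching real secrets.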

So the Python script is where the core logic lives. Let's look at the code below.

# Apache License
# Version 2.0, January 2004
# Author: Eugene Tkachenko

import os
from git import Git 
from pathlib import Path
from ai.chat_gpt import ChatGPT
from ai.ai_bot import AiBot
from log import Log
from env_vars import EnvVars
from repository.github import GitHub
from repository.repository import RepositoryError

separator = "\n\n----------------------------------------------------------------------\n\n"
log_file = open('output.txt', 'a')

def main():
    vars = EnvVars()
    vars.check_vars()

    ai = ChatGPT(vars.chat_gpt_token, vars.chat_gpt_model)
    github = GitHub(vars.token, vars.owner, vars.repo, vars.pull_number)

    # Runs `git remote -v` and returns the name of the remote (e.g. origin).
    remote_name = Git.get_remote_name()

    Log.print_green("Remote is", remote_name)
    # Checks for changes with `git diff <origin branch> <target branch>` and returns the list of changed files.
    changed_files = Git.get_diff_files(remote_name=remote_name, head_ref=vars.head_ref, base_ref=vars.base_ref)
    Log.print_green("Found changes in files", changed_files)
    if len(changed_files) == 0: 
        Log.print_red("No changes between branch")

    for file in changed_files:
        Log.print_green("Checking file", file)

        _, file_extension = os.path.splitext(file)
        file_extension = file_extension.lstrip('.')  # filter by extension
        if file_extension not in vars.target_extensions:
            Log.print_yellow(f"Skipping, unsuported extension {file_extension} file {file}")
            continue

        try:
            # Read the file contents
            with open(file, 'r') as file_opened:
                file_content = file_opened.read()
        except FileNotFoundError:
            Log.print_yellow("File was removed. Continue.", file)
            continue

        if len( file_content ) == 0: 
            Log.print_red("File is empty")
            continue

        # Run the diff per file. The result is passed to ChatGPT.
        file_diffs = Git.get_diff_in_file(remote_name=remote_name, head_ref=vars.head_ref, base_ref=vars.base_ref, file_path=file)
        if len( file_diffs ) == 0: 
            Log.print_red("Diffs are empty")

        # Query ChatGPT. The prompt can be found in ai_bot.py in the repo.
        Log.print_green(f"Asking AI. Content Len:{len(file_content)} Diff Len: {len(file_diffs)}")
        response = ai.ai_request_diffs(code=file_content, diffs=file_diffs)

        log_file.write(f"{separator}{file_content}{separator}{file_diffs}{separator}{response}{separator}")

        if AiBot.is_no_issues_text(response):
            Log.print_green("File looks good. Continue", file)
        else:
            responses = AiBot.split_ai_response(response)
            if len(responses) == 0:
                Log.print_red("Responses where not parsed:", responses)

            result = False
            for response in responses:
                if response.line:
                    result = post_line_comment(github=github, file=file, text=response.text, line=response.line)
                if not result:
                    result = post_general_comment(github=github, file=file, text=response.text)
                if not result:
                    raise RepositoryError("Failed to post any comments.")

def post_line_comment(github: GitHub, file: str, text:str, line: int):
    Log.print_green("Posting line", file, line, text)
    try:
        git_response = github.post_comment_to_line(
            text=text, 
            commit_id=Git.get_last_commit_sha(file=file), 
            file_path=file, 
            line=line,
        )
        Log.print_yellow("Posted", git_response)
        return True
    except RepositoryError as e:
        Log.print_red("Failed line comment", e)
        return False

def post_general_comment(github: GitHub, file: str, text:str) -> bool:
    Log.print_green("Posting general", file, text)
    try:
        message = f"{file}\n{text}"
        git_response = github.post_comment_general(message)
        Log.print_yellow("Posted general", git_response)
        return True
    except RepositoryError:
        Log.print_red("Failed general comment")
        return False

if __name__ == "__main__":
    main()

log_file.close()
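
The `Git` helper itself is not shown above. As a rough idea of what `get_diff_files` could be doing, here is a minimal sketch built on `subprocess`. The `--name-only` flag and the `<remote>/<branch>` ref format are my assumptions, not taken from the repo, and `build_diff_command` and `parse_changed_files` are hypothetical helper names.

```python
import subprocess
from typing import List


def build_diff_command(remote_name: str, head_ref: str, base_ref: str) -> List[str]:
    # --name-only makes git print only the paths that differ between the two refs.
    return [
        "git", "diff", "--name-only",
        f"{remote_name}/{base_ref}",
        f"{remote_name}/{head_ref}",
    ]


def parse_changed_files(git_output: str) -> List[str]:
    # One path per line; drop blank lines and surrounding whitespace.
    return [line.strip() for line in git_output.splitlines() if line.strip()]


def get_diff_files(remote_name: str, head_ref: str, base_ref: str) -> List[str]:
    # check=True raises CalledProcessError if git exits with a non-zero status,
    # e.g. when one of the refs has not been fetched.
    result = subprocess.run(
        build_diff_command(remote_name, head_ref, base_ref),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_changed_files(result.stdout)
```

Something like `get_diff_files("origin", head_ref, base_ref)` only works when both branches are available locally, which is presumably why the workflow checks out with `fetch-depth: 0`.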

Prices?

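The demo used GPT-4. Each pull request review cost about $0.10 (10 cents), and the testing done for this video reportedly came to $5.

That was the cost even though the changes were small (around 200 characters), not every commit was reviewed, and files such as JSON and XML were excluded from review.
Used by a whole team in production, the bill would likely add up fast.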
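
To put those numbers in perspective: $5 at $0.10 per review is about 50 review runs, and since the workflow fires on `opened`, `synchronize` and `reopened`, every push to an open PR is another run. Here is a back-of-the-envelope sketch. The team figures (10 PRs a day, 3 runs per PR, 22 working days) are made-up assumptions for illustration, not numbers from the video.

```python
COST_PER_REVIEW_CENTS = 10  # $0.10 per review, as reported in the video
TEST_BUDGET_CENTS = 500     # $5 spent on testing for the video


def reviews_for_budget(budget_cents: int, cost_per_review_cents: int) -> int:
    # How many review runs a given budget pays for (integer cents avoid float noise).
    return budget_cents // cost_per_review_cents


def monthly_cost_cents(
    prs_per_day: int,
    runs_per_pr: int,
    working_days: int,
    cost_per_review_cents: int = COST_PER_REVIEW_CENTS,
) -> int:
    # Every workflow run is billed, so pushes to an open PR multiply the cost.
    return prs_per_day * runs_per_pr * working_days * cost_per_review_cents


if __name__ == "__main__":
    runs = reviews_for_budget(TEST_BUDGET_CENTS, COST_PER_REVIEW_CENTS)
    print(f"Test budget covers {runs} review runs")

    # Hypothetical team: 10 PRs/day, 3 runs per PR, 22 working days.
    cents = monthly_cost_cents(prs_per_day=10, runs_per_pr=3, working_days=22)
    print(f"Estimated monthly cost: ${cents / 100:.2f}")
```

With these hypothetical inputs the estimate is 6600 cents, i.e. $66 a month, and that assumes every PR stays as small as the ones in the demo.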

Cheaper models such as GPT-3.5 did not give very good responses.

On the same commit, GPT-3.5 often reported "no flaws" for errors that GPT-4 caught.

Even when the same commit was submitted as a pull request again, the responses differed: sometimes 3 comments, sometimes 7.

The full changed file and the git diff were both sent, but the diff seems to have been used only to let ChatGPT cite line numbers in its response.

Leaving the diff out could probably reduce the cost, but this was not tested.
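
One cheap way to sanity-check that idea before spending API credits would be to compare prompt sizes with and without the diff. This is only a rough sketch: the 4-characters-per-token ratio is a common rule of thumb for English text (real tokenizers, especially on code, will differ), and `estimate_tokens` and `diff_share` are hypothetical helpers, not part of the repo.

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb, NOT the model's real tokenizer


def estimate_tokens(text: str) -> int:
    # Ceiling division so that any non-empty text counts as at least one token.
    return -(-len(text) // CHARS_PER_TOKEN)


def diff_share(file_content: str, diff: str) -> float:
    """Fraction of the estimated input tokens that the diff accounts for."""
    diff_tokens = estimate_tokens(diff)
    total = estimate_tokens(file_content) + diff_tokens
    return diff_tokens / total if total else 0.0


if __name__ == "__main__":
    # Made-up sizes: a 4000-character file with a 1000-character diff.
    share = diff_share("x" * 4000, "y" * 1000)
    print(f"Diff is roughly {share:.0%} of the input")
```

If the share comes out small, dropping the diff saves little on input tokens. Output tokens are billed as well and are not covered by this estimate.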

There were meaningful reviews and there were also false positives. It seems worth trying out yourself.
