Staging #92

Open: wants to merge 6 commits into base: main

AviAvni (Contributor) commented May 12, 2025

PR Type

Enhancement


Description

  • Enhanced entity symbol resolution with new Symbol class

  • Added support for storing call line and text in graph relationships

  • Refactored entity relationship connection to accept properties

  • Improved handling of resolved symbols in analyzers and entities


Changes walkthrough 📝

Relevant files

Enhancement

entity.py: Refactor entity symbol handling with new Symbol class (+11/-11)

api/entities/entity.py

  • Introduced a new Symbol class to encapsulate symbol and its resolutions
  • Refactored Entity to use Symbol objects instead of raw nodes
  • Updated symbol resolution logic to work with the new Symbol structure
  • Removed the resolved_symbols dictionary in favor of per-symbol storage

source_analyzer.py: Update analyzer for new symbol structure and call metadata (+10/-7)

api/analyzers/source_analyzer.py

  • Updated to use new Symbol class for entity symbols
  • Modified call relationship to include line and text properties
  • Changed symbol resolution and connection logic for compatibility

graph.py: Allow relationship properties in connect_entities method (+3/-2)

api/graph.py

  • Modified connect_entities to accept and set relationship properties
  • Updated method signature and Cypher query to handle properties

    Summary by CodeRabbit

    New Features

      • Added support for attaching metadata, such as line numbers and decoded text, to call relationships in the graph.

    Improvements

      • Enhanced internal handling of symbols and their resolved counterparts for more accurate graph connections.
      • Relationships in the graph can now store additional properties, improving flexibility and contextual detail.


    vercel bot commented May 12, 2025

    The latest updates on your projects.

    Name: code-graph-backend
    Status: ✅ Ready
    Updated (UTC): May 12, 2025 6:44am

    Contributor

    coderabbitai bot commented May 12, 2025

    Walkthrough

    The changes refactor how symbols and their resolved relationships are managed within entities, introducing a new Symbol class to encapsulate both the symbol node and its resolved symbols. The graph connection logic is updated to utilize resolved symbol IDs and now allows setting relationship properties, such as line numbers and code text for call edges.

    Changes

    | File(s) | Change Summary |
    | --- | --- |
    | api/entities/entity.py | Introduced a Symbol class to wrap a Node and its resolved symbols. Changed Entity.symbols to store lists of Symbol objects. Removed the resolved_symbols attribute from Entity. Updated methods to manage resolved symbols within Symbol instances. |
    | api/analyzers/source_analyzer.py | Updated the second_pass method to iterate over entity.symbols, extract resolved symbols from each Symbol, and use their IDs for graph connections. For "call" relationships, added metadata (line number, code text) to the graph connection. |
    | api/graph.py | Modified connect_entities to accept an optional properties dictionary, allowing properties to be set on the relationship in the graph database. Updated the Cypher query to merge these properties. |

    Sequence Diagram(s)

    sequenceDiagram
        participant Analyzer as SourceAnalyzer
        participant Entity
        participant Symbol
        participant Graph
    
        Analyzer->>Entity: Iterate over symbols
        loop For each symbol in entity.symbols
            Entity->>Symbol: Access resolved_symbol set
            alt resolved_symbol set is not empty
                Symbol->>Graph: connect_entities(relation, src_id, resolved_id, properties)
            end
        end
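The flow in the diagram can be sketched in plain Python. This is a minimal illustration, not the PR's actual code: `FakeEntity`, `RecordingGraph`, the upper-cased relation name, and the `line`/`text` property keys are all assumptions made for the example, and the symbol node is a plain dict instead of a tree-sitter Node.

```python
from dataclasses import dataclass, field


class Symbol:
    """Mirrors the Symbol class quoted in this PR; the node is any object here."""

    def __init__(self, symbol):
        self.symbol = symbol
        self.resolved_symbol = set()

    def add_resolve_symbol(self, resolved_symbol):
        self.resolved_symbol.add(resolved_symbol)


@dataclass(eq=False)  # eq=False keeps identity hashing, so entities can live in a set
class FakeEntity:
    """Hypothetical stand-in for Entity: an id plus symbols grouped by key."""
    id: int
    symbols: dict = field(default_factory=dict)

    def add_symbol(self, key, node):
        self.symbols.setdefault(key, []).append(Symbol(node))


class RecordingGraph:
    """Hypothetical stand-in for Graph that records calls instead of running Cypher."""

    def __init__(self):
        self.edges = []

    def connect_entities(self, relation, src_id, dest_id, properties=None):
        self.edges.append((relation, src_id, dest_id, dict(properties or {})))


def second_pass(entity, graph):
    for key, symbols in entity.symbols.items():
        for symbol in symbols:
            if not symbol.resolved_symbol:
                continue  # unresolved symbols produce no edge
            resolved = next(iter(symbol.resolved_symbol))
            props = {}
            if key == "call":
                # Only call edges carry metadata; the property names are assumed.
                props = {"line": symbol.symbol["line"], "text": symbol.symbol["text"]}
            graph.connect_entities(key.upper(), entity.id, resolved.id, props)


caller, callee = FakeEntity(1), FakeEntity(2)
caller.add_symbol("call", {"line": 10, "text": "helper()"})
caller.add_symbol("call", {"line": 12, "text": "missing()"})  # never resolved
caller.symbols["call"][0].add_resolve_symbol(callee)

graph = RecordingGraph()
second_pass(caller, graph)
print(graph.edges)
```

Only the resolved call produces an edge, and that edge carries the line and text of the call site.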
    

    Poem

    In fields of code, a symbol hops,
    With friends resolved in clever crops.
    Connections drawn with extra flair—
    Line numbers, text, a graph to share!
    A bunny beams at tidy ties,
    As relationships now harmonize.
    🐇✨



    Qodo Merge was enabled for this repository. To continue using it, please link your Git account with your Qodo account here.

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Symbol Resolution

    The new Symbol class implementation changes how symbols are resolved and stored. Verify that all existing code that previously accessed resolved_symbols directly has been updated to use the new Symbol.resolved_symbol property.

    class Symbol:
        def __init__(self, symbol: Node):
            self.symbol = symbol
            self.resolved_symbol = set()
    
        def add_resolve_symbol(self, resolved_symbol):
            self.resolved_symbol.add(resolved_symbol)
    Null Check

    The code now skips symbols whose resolved_symbol set is empty, but it doesn't handle the case where a symbol resolves to more than one entity. Because resolved_symbol is a set, next(iter(...)) returns an arbitrary element rather than a well-defined first one. Consider whether picking a single resolved symbol is always the correct approach.

    if len(symbol.resolved_symbol) == 0:
        continue
    resolved_symbol = next(iter(symbol.resolved_symbol))
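If more than one resolution is possible, one option is to order the candidates before choosing, or to connect all of them. The sketch below assumes resolved entities expose an integer `id`; `Target` and `resolved_targets` are hypothetical names used only for illustration.

```python
class Target:
    """Hypothetical resolved entity; only the id matters here."""

    def __init__(self, id: int):
        self.id = id


def resolved_targets(resolved_symbol: set) -> list:
    """Return every resolved entity in a stable order (sorted by id)."""
    return sorted(resolved_symbol, key=lambda entity: entity.id)


candidates = {Target(7), Target(3), Target(5)}
ordered = resolved_targets(candidates)

# Iterating `ordered` connects every candidate; ordered[0] is a repeatable single pick.
print([target.id for target in ordered])
```

Sorting makes the choice reproducible across runs, which `next(iter(...))` on a set does not guarantee.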
    Default Parameter

    The connect_entities method now uses an empty dictionary as a default parameter. A mutable default is created once, when the function is defined, and is shared by every call that omits the argument, so modifying it can cause unexpected behavior. Consider using None and initializing the dictionary inside the function.

    def connect_entities(self, relation: str, src_id: int, dest_id: int, properties: dict = {}) -> None:
        """


    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category: Possible issue

    Suggestion: Add exception handling

    The function doesn't handle potential exceptions from the callback function f.
    If f raises an exception, the entire resolution process will fail. Add exception
    handling to make the code more robust.

    api/entities/entity.py [27-31]

     def resolved_symbol(self, f: Callable[[str, Node], list[Self]]):
         for key, symbols in self.symbols.items():
             for symbol in symbols:
    -            for resolved_symbol in f(key, symbol.symbol):
    -                symbol.add_resolve_symbol(resolved_symbol)
    +            try:
    +                for resolved_symbol in f(key, symbol.symbol):
    +                    symbol.add_resolve_symbol(resolved_symbol)
    +            except Exception as e:
    +                import logging
    +                logging.error(f"Error resolving symbol: {e}")
    Suggestion importance [1-10]: 6

    Why: The suggestion adds exception handling around the callback function f, which prevents the entire resolution process from failing if a single symbol resolution fails. This is a meaningful improvement to error resilience, especially since external callbacks can be unpredictable.
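A variant of this suggestion, sketched under assumptions: it uses a module-level logger and `logger.exception` so the traceback is kept, and it counts failures so the caller can decide what to do. `resolve_all` and `flaky` are hypothetical names, and strings stand in for tree-sitter nodes.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


class Symbol:
    """Mirrors the Symbol class quoted in this PR."""

    def __init__(self, symbol):
        self.symbol = symbol
        self.resolved_symbol = set()

    def add_resolve_symbol(self, resolved_symbol):
        self.resolved_symbol.add(resolved_symbol)


def resolve_all(symbols: dict, f: Callable[[str, object], Iterable]) -> int:
    """Resolve every symbol with f and return how many symbols failed."""
    failures = 0
    for key, group in symbols.items():
        for symbol in group:
            try:
                for resolved in f(key, symbol.symbol):
                    symbol.add_resolve_symbol(resolved)
            except Exception:
                failures += 1
                # logger.exception records the message together with the traceback.
                logger.exception("Error resolving %s symbol %r", key, symbol.symbol)
    return failures


def flaky(key, node):
    """Hypothetical resolver that fails for one particular node."""
    if node == "bad":
        raise ValueError("cannot resolve")
    return [node.upper()]


symbols = {"call": [Symbol("ok"), Symbol("bad"), Symbol("fine")]}
failed = resolve_all(symbols, flaky)
print(failed)
```

One failing symbol no longer stops the remaining symbols from being resolved.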

    Impact: Low

    Suggestion: Add error handling

    The code assumes symbol.resolved_symbol is never empty after the check, but
    doesn't handle the case where it could be empty. Add proper error handling to
    avoid potential runtime errors if resolved_symbol is empty.

    api/analyzers/source_analyzer.py [148-150]

    -if len(symbol.resolved_symbol) == 0:
    +if not symbol.resolved_symbol:
         continue
    -resolved_symbol = next(iter(symbol.resolved_symbol))
    +try:
    +    resolved_symbol = next(iter(symbol.resolved_symbol))
    +except StopIteration:
    +    logging.warning(f"Empty resolved symbol set for {key} in {entity.id}")
    +    continue
    Suggestion importance [1-10]: 5

    Why: The suggestion adds defensive error handling for the case where symbol.resolved_symbol might be empty after the initial check. While the existing code already checks for emptiness, the improved version adds logging and handles potential StopIteration exceptions, which improves robustness.

    Impact: Low


    CI Feedback 🧐

    A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

    Action: build

    Failed stage: Lint with flake8 [❌]

    Failed test name: flake8 syntax check

    Failure summary:

    The action failed at the Flake8 lint step. The finding is not a syntax error: in file
    ./api/llm.py at line 238 there is an unused global declaration:


    global ontology


    F824 is a Pyflakes check reporting that global ontology is declared but the name is never assigned
    within that scope. The workflow runs flake8 with --select=E9,F63,F7,F82, and flake8 matches selected
    codes by prefix, so F82 also selects F824 and the step exits with code 1.
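The usual fix is to drop the `global` statement wherever the name is only read. The sketch below is illustrative only: the real contents of api/llm.py are not shown in this thread, so `ontology`, `entity_count`, and `reset_ontology` are hypothetical.

```python
ontology = {"entities": ["File", "Function"]}  # hypothetical module-level state


def entity_count() -> int:
    # Reading (or mutating) a module-level object needs no `global` statement.
    # Declaring `global ontology` here without assigning to it is what F824 reports.
    return len(ontology["entities"])


def reset_ontology() -> None:
    # `global` is required only when the function rebinds the name itself.
    global ontology
    ontology = {"entities": []}


before = entity_count()
reset_ontology()
after = entity_count()
print(before, after)
```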

    Relevant error logs:
    1:  ##[group]Operating System
    2:  Ubuntu
    ...
    
    552:  Successfully built multilspy falkordb graphrag-sdk python-abc ratelimit
    553:  Installing collected packages: wcwidth, ratelimit, python-abc, pure-eval, ptyprocess, zipp, validators, urllib3, typing-extensions, tree-sitter-python, tree-sitter-java, tree-sitter-c, tree-sitter, traitlets, tqdm, tornado, toml, soupsieve, sniffio, six, rpds-py, regex, pyzmq, pyyaml, python-dotenv, pygments, pycparser, psutil, propcache, prompt-toolkit, platformdirs, pexpect, parso, packaging, nest-asyncio, markupsafe, jiter, itsdangerous, idna, h11, fsspec, frozenlist, fix-busted-json, filelock, executing, exceptiongroup, docstring-to-markdown, distro, decorator, debugpy, ct3, click, charset-normalizer, certifi, blinker, backoff, attrs, async-timeout, asttokens, annotated-types, aiohappyeyeballs, werkzeug, stack-data, requests, referencing, redis, python-dateutil, pypdf, pydantic-core, multidict, matplotlib-inline, jupyter-core, jinja2, jedi, javatools, importlib-metadata, httpcore, comm, cffi, cattrs, beautifulsoup4, anyio, aiosignal, yarl, tiktoken, pygit2, pydantic, lsprotocol, jupyter-client, jsonschema-specifications, ipython, huggingface-hub, httpx, flask, falkordb, bs4, tokenizers, pygls, openai, ollama, jsonschema, ipykernel, aiohttp, litellm, jedi-language-server, graphrag-sdk, multilspy
    554:  Attempting uninstall: typing-extensions
    555:  Found existing installation: typing_extensions 4.13.2
    556:  Uninstalling typing_extensions-4.13.2:
    557:  Successfully uninstalled typing_extensions-4.13.2
    558:  Attempting uninstall: packaging
    559:  Found existing installation: packaging 25.0
    560:  Uninstalling packaging-25.0:
    561:  Successfully uninstalled packaging-25.0
    562:  Attempting uninstall: exceptiongroup
    563:  Found existing installation: exceptiongroup 1.3.0
    564:  Uninstalling exceptiongroup-1.3.0:
    565:  Successfully uninstalled exceptiongroup-1.3.0
    566:  Successfully installed aiohappyeyeballs-2.4.4 aiohttp-3.11.11 aiosignal-1.3.2 annotated-types-0.7.0 anyio-4.8.0 asttokens-3.0.0 async-timeout-5.0.1 attrs-25.1.0 backoff-2.2.1 beautifulsoup4-4.12.3 blinker-1.9.0 bs4-0.0.2 cattrs-24.1.2 certifi-2024.12.14 cffi-1.17.1 charset-normalizer-3.4.1 click-8.1.8 comm-0.2.2 ct3-3.4.0 debugpy-1.8.12 decorator-5.1.1 distro-1.9.0 docstring-to-markdown-0.15 exceptiongroup-1.2.2 executing-2.2.0 falkordb-1.0.10 filelock-3.17.0 fix-busted-json-0.0.18 flask-3.1.0 frozenlist-1.5.0 fsspec-2024.12.0 graphrag-sdk-0.5.0 h11-0.14.0 httpcore-1.0.7 httpx-0.27.2 huggingface-hub-0.28.0 idna-3.10 importlib-metadata-8.6.1 ipykernel-6.29.5 ipython-8.31.0 itsdangerous-2.2.0 javatools-1.6.0 jedi-0.19.2 jedi-language-server-0.41.1 jinja2-3.1.5 jiter-0.8.2 jsonschema-4.23.0 jsonschema-specifications-2024.10.1 jupyter-client-8.6.3 jupyter-core-5.7.2 litellm-1.59.9 lsprotocol-2023.0.1 markupsafe-3.0.2 matplotlib-inline-0.1.7 multidict-6.1.0 multilspy-0.0.11 nest-asyncio-1.6.0 ollama-0.2.1 openai-1.60.2 packaging-24.2 parso-0.8.4 pexpect-4.9.0 platformdirs-4.3.6 prompt-toolkit-3.0.50 propcache-0.2.1 psutil-6.1.1 ptyprocess-0.7.0 pure-eval-0.2.3 pycparser-2.22 pydantic-2.10.6 pydantic-core-2.27.2 pygit2-1.17.0 pygls-1.3.1 pygments-2.19.1 pypdf-4.3.1 python-abc-0.2.0 python-dateutil-2.9.0.post0 python-dotenv-1.0.1 pyyaml-6.0.2 pyzmq-26.2.0 ratelimit-2.2.1 redis-5.2.1 referencing-0.36.2 regex-2024.11.6 requests-2.32.3 rpds-py-0.22.3 six-1.17.0 sniffio-1.3.1 soupsieve-2.6 stack-data-0.6.3 tiktoken-0.8.0 tokenizers-0.21.0 toml-0.10.2 tornado-6.4.2 tqdm-4.67.1 traitlets-5.14.3 tree-sitter-0.24.0 tree-sitter-c-0.23.4 tree-sitter-java-0.23.5 tree-sitter-python-0.23.6 typing-extensions-4.12.2 urllib3-2.3.0 validators-0.34.0 wcwidth-0.2.13 werkzeug-3.1.3 yarl-1.18.3 zipp-3.21.0
    567:  ##[group]Run # stop the build if there are Python syntax errors or undefined names
    568:  # stop the build if there are Python syntax errors or undefined names
    569:  flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
    570:  # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
    571:  # flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
    572:  shell: /usr/bin/bash -e {0}
    573:  env:
    574:  pythonLocation: /opt/hostedtoolcache/Python/3.10.17/x64
    575:  LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.10.17/x64/lib
    576:  ##[endgroup]
    577:  ./api/llm.py:238:5: F824 `global ontology` is unused: name is never assigned in scope
    578:  global ontology
    579:  ^
    580:  1     F824 `global ontology` is unused: name is never assigned in scope
    581:  1
    582:  ##[error]Process completed with exit code 1.
    583:  Post job cleanup.
    584:  [command]/usr/bin/git version
    585:  git version 2.49.0
    586:  Temporarily overriding HOME='/home/runner/work/_temp/1d234e83-3bd4-427d-83d9-57f63c4c96c6' before making global git config changes
    587:  Adding repository directory to the temporary git global config as a safe directory
    588:  [command]/usr/bin/git config --global --add safe.directory /home/runner/work/code-graph-backend/code-graph-backend
    589:  [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand
    590:  [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"
    591:  fatal: No url found for submodule path 'tests/git_repo' in .gitmodules
    592:  ##[warning]The process '/usr/bin/git' failed with exit code 128
    593:  Cleaning up orphan processes
    

    Contributor

    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 0

    🔭 Outside diff range comments (1)
    api/graph.py (1)

    482-498: 🛠️ Refactor suggestion

    Fix mutable default argument in connect_entities method.

    Using a mutable object ({}) as a default argument value can lead to unexpected behavior because the default value is evaluated only once when the function is defined, not every time the function is called.

    Apply this change to fix the issue:

    -def connect_entities(self, relation: str, src_id: int, dest_id: int, properties: dict = {}) -> None:
    +def connect_entities(self, relation: str, src_id: int, dest_id: int, properties: dict = None) -> None:
        """
        Establish a relationship between src and dest
    
        Args:
            src_id (int): ID of the source node.
            dest_id (int): ID of the destination node.
        """
    
        q = f"""MATCH (src), (dest)
                 WHERE ID(src) = $src_id AND ID(dest) = $dest_id
                 MERGE (src)-[e:{relation}]->(dest)
                 SET e += $properties
                 RETURN e"""
    
    -    params = {'src_id': src_id, 'dest_id': dest_id, "properties": properties}
    +    params = {'src_id': src_id, 'dest_id': dest_id, "properties": properties if properties is not None else {}}
        self._query(q, params)
    🧰 Tools
    🪛 Ruff (0.8.2)

    482-482: Do not use mutable data structures for argument defaults

    Replace with None; initialize within function

    (B006)
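A small demonstration of what B006 guards against, using hypothetical helper functions rather than the real connect_entities. As quoted above, connect_entities only reads properties, so the risk there is latent; it would surface as soon as the method or a caller mutated the default.

```python
from typing import Optional


def tag_shared(edge_id: int, properties: dict = {}) -> dict:
    # Pitfall: the same dict object is reused by every call that omits `properties`.
    properties["edge_id"] = edge_id
    return properties


def tag_fresh(edge_id: int, properties: Optional[dict] = None) -> dict:
    # Fix: default to None and build a new dict on each call.
    properties = {} if properties is None else dict(properties)
    properties["edge_id"] = edge_id
    return properties


a = tag_shared(1)
b = tag_shared(2)   # overwrites the value the first caller received
c = tag_fresh(1)
d = tag_fresh(2)    # independent of c

print(a is b, a, c, d)
```

Copying a caller-supplied dict in `tag_fresh` is a further design choice that also protects the caller's object from being modified.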

    🧹 Nitpick comments (2)
    api/analyzers/source_analyzer.py (1)

    146-163: Enhance readability for the resolved symbol selection logic.

    The changes look good overall, with effective use of the new Symbol class and adding properties to relationships. However, selecting just the first resolved symbol with next(iter(symbol.resolved_symbol)) could use some explanation for future maintainers.

    Consider adding a comment to explain this pattern:

    -            resolved_symbol = next(iter(symbol.resolved_symbol))
    +            # Take the first resolved symbol - in our analyzer context, 
    +            # each symbol should resolve to at most one entity
    +            resolved_symbol = next(iter(symbol.resolved_symbol))
    api/entities/entity.py (1)

    4-10: Add docstrings to the new Symbol class.

    The introduction of the Symbol class is a great design choice that nicely encapsulates a symbol and its resolved counterparts.

    Consider adding docstrings to improve code documentation:

     class Symbol:
    +    """
    +    Encapsulates a tree-sitter Node representing a symbol and its resolved symbols.
    +    
    +    A symbol is a reference to an entity in the codebase, such as a function or class name.
    +    Resolved symbols are the actual entities that the symbol refers to.
    +    """
         def __init__(self, symbol: Node):
    +        """
    +        Initialize a Symbol with a tree-sitter Node.
    +        
    +        Args:
    +            symbol (Node): The tree-sitter Node representing the symbol.
    +        """
             self.symbol = symbol
             self.resolved_symbol = set()
    
         def add_resolve_symbol(self, resolved_symbol):
    +        """
    +        Add a resolved symbol to this Symbol's set of resolved symbols.
    +        
    +        Args:
    +            resolved_symbol: The resolved entity that this symbol refers to.
    +        """
             self.resolved_symbol.add(resolved_symbol)
    📜 Review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between 2d71c19 and 1501945.

    📒 Files selected for processing (3)
    • api/analyzers/source_analyzer.py (1 hunks)
    • api/entities/entity.py (1 hunks)
    • api/graph.py (2 hunks)

    🔇 Additional comments (3)
    api/entities/entity.py (3)

    15-15: LGTM: Updated type annotation for symbols.

    The updated type annotation correctly reflects the change from a dictionary of Node lists to a dictionary of Symbol lists.


    21-21: LGTM: Updated to use the new Symbol class.

    This correctly wraps the Node in a Symbol object before adding it to the list.


    30-31: LGTM: Correctly integrates with the Symbol class.

    The code now properly passes the Node from the Symbol for resolution, then adds resolved symbols to the Symbol's collection.

    2 participants