Fix performance issue: cache agent knowledge to avoid reloading on every kickoff #3077

Open
wants to merge 4 commits into
base: main
from

devin-ai-integration[bot]
Contributor

@devin-ai-integration devin-ai-integration bot commented Jun 27, 2025

Fix: Cache agent knowledge to prevent unnecessary reloading on repeated kickoffs

Summary

This PR adds caching to the Agent.set_knowledge() method so that agent knowledge is no longer reloaded on every crew kickoff. The reload was triggered from crew.py line 645, where knowledge sources were processed (chunked, embedded, stored) again on each kickoff even when nothing had changed.

Key Changes:

  • Added knowledge state tracking with private attributes _knowledge_loaded, _last_embedder, _last_knowledge_sources
  • Modified set_knowledge() to skip reloading when knowledge hasn't changed
  • Added reset_knowledge_cache() method for explicit cache clearing when needed
  • Added comprehensive test coverage for caching behavior and edge cases

The cache is invalidated when the knowledge sources or the embedder change. Otherwise, repeated kickoffs with the same agent reuse the knowledge that is already loaded.
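The cache check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `SketchAgent` is a hypothetical stand-in for `Agent`, `_load_knowledge` and `load_count` are invented to make the expensive step observable, and the `set_knowledge(embedder)` signature is assumed. Only the three private attribute names and `reset_knowledge_cache()` come from the description.

```python
from typing import Any, List, Optional


class SketchAgent:
    """Hypothetical stand-in for Agent; models only the caching logic."""

    def __init__(self, knowledge_sources: Optional[List[Any]] = None) -> None:
        self.knowledge_sources = knowledge_sources
        self.load_count = 0  # illustration only: counts expensive loads
        self._knowledge_loaded = False
        self._last_embedder: Optional[dict] = None
        self._last_knowledge_sources: Optional[List[Any]] = None

    def _load_knowledge(self, embedder: Optional[dict]) -> None:
        # Placeholder for the chunk/embed/store work.
        self.load_count += 1

    def set_knowledge(self, embedder: Optional[dict] = None) -> None:
        unchanged = (
            self._knowledge_loaded
            and self._last_embedder == embedder
            and self._last_knowledge_sources == self.knowledge_sources
        )
        if unchanged:
            return  # cache hit: skip reloading

        self._load_knowledge(embedder)
        self._knowledge_loaded = True
        self._last_embedder = embedder
        # Store a copy of the list so appending to the original is detected.
        self._last_knowledge_sources = (
            list(self.knowledge_sources)
            if self.knowledge_sources is not None
            else None
        )

    def reset_knowledge_cache(self) -> None:
        self._knowledge_loaded = False
        self._last_embedder = None
        self._last_knowledge_sources = None
```

Calling `set_knowledge()` twice with the same sources and embedder runs the load once; changing either input, or calling `reset_knowledge_cache()`, forces the next call to load again.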

Review & Testing Checklist for Human

  • Verify cache invalidation logic - Test that knowledge is properly reloaded when knowledge sources or embedder configurations change, and NOT reloaded when they stay the same
  • End-to-end performance testing - Create a crew with knowledge sources and run multiple kickoffs to verify the performance improvement actually occurs
  • Test edge cases - Verify behavior with different knowledge source types, embedder configurations, and the reset_knowledge_cache() method
  • Backward compatibility - Ensure existing workflows still work correctly with the new caching behavior

Recommended Test Plan:

  1. Create an agent with knowledge sources (e.g., StringKnowledgeSource)
  2. Run crew.kickoff() multiple times and measure/verify that knowledge loading only happens once
  3. Change knowledge sources mid-way and verify knowledge gets reloaded
  4. Test with different embedder configurations to ensure cache invalidation works
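One way to check steps 2 to 4 without network calls is to spy on the loading step and count calls across kickoffs. The sketch below uses hypothetical `StubAgent` and `StubCrew` classes in place of the real `Agent` and `Crew`, and `_load_knowledge` is an invented hook; against the real code base the spy would be attached to whatever method performs the load.

```python
from unittest.mock import patch


class StubAgent:
    """Hypothetical agent with the caching behaviour under test."""

    def __init__(self, knowledge_sources):
        self.knowledge_sources = knowledge_sources
        self._cached_key = None

    def _load_knowledge(self, embedder):
        """Placeholder for chunking, embedding and storing."""

    def set_knowledge(self, embedder=None):
        key = (embedder, tuple(self.knowledge_sources))
        if key != self._cached_key:
            self._load_knowledge(embedder)
            self._cached_key = key


class StubCrew:
    """Hypothetical crew that calls set_knowledge on every kickoff."""

    def __init__(self, agents, embedder=None):
        self.agents = agents
        self.embedder = embedder

    def kickoff(self):
        for agent in self.agents:
            agent.set_knowledge(self.embedder)


def count_loads(crew, agent, kickoffs):
    """Run `kickoffs` kickoffs and return how often knowledge was loaded."""
    with patch.object(
        agent, "_load_knowledge", wraps=agent._load_knowledge
    ) as spy:
        for _ in range(kickoffs):
            crew.kickoff()
        return spy.call_count
```

With this helper, three kickoffs on a fresh agent should report one load, further kickoffs should report zero, and a change to the sources or to the embedder should report exactly one more.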

Diagram

graph TD
    crew[src/crewai/crew.py]
    agent[src/crewai/agent.py]:::major-edit
    knowledge[src/crewai/knowledge/knowledge.py]:::context
    agent_tests[tests/agent_test.py]:::major-edit
    
    crew -->|calls set_knowledge| agent
    agent -->|creates/caches| knowledge
    agent_tests -->|tests caching behavior| agent
    
    subgraph "Agent Caching Logic"
        cache_check[Check _knowledge_loaded flag]
        compare_state[Compare _last_embedder & _last_knowledge_sources]
        skip_load[Skip knowledge loading]
        load_knowledge[Load knowledge & update cache]
        
        cache_check --> compare_state
        compare_state -->|same| skip_load
        compare_state -->|different| load_knowledge
    end
    
    subgraph Legend
        L1[Major Edit]:::major-edit
        L2[Minor Edit]:::minor-edit  
        L3[Context/No Edit]:::context
    end

classDef major-edit fill:#90EE90
classDef minor-edit fill:#87CEEB
classDef context fill:#FFFFFF

Notes

  • Performance Impact: This fix addresses issue #3076 ("[BUG] crew.py reloads memory on every kickoff causing performance issues"), where repeated kickoffs degraded performance because knowledge was reprocessed unnecessarily
  • Cache Strategy: Uses simple state comparison (embedder config + knowledge sources) to determine when cache is valid
  • Memory Considerations: Cache stores references to knowledge sources and embedder configs - monitor for potential memory usage in long-running applications
  • Thread Safety: Current implementation is not thread-safe - consider this if agents are used in multi-threaded environments
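If thread safety becomes a requirement, one option is to guard the check-and-load step with a lock so that concurrent callers cannot both see a stale cache and load twice. The sketch below is an assumption about how that could look, not part of this PR; `ThreadSafeKnowledgeCache` and its `loader` callback are hypothetical names.

```python
import threading
from typing import Any, Callable, Optional, Sequence


class ThreadSafeKnowledgeCache:
    """Hypothetical lock-protected cache around an expensive loader."""

    def __init__(self, loader: Callable[[Optional[dict], Sequence[Any]], Any]):
        self._loader = loader
        self._lock = threading.Lock()
        self._loaded = False
        self._key = None
        self._value = None

    def get(self, embedder: Optional[dict], sources: Sequence[Any]) -> Any:
        # Compared with ==, so neither the embedder nor the sources
        # need to be hashable.
        key = (embedder, tuple(sources))
        with self._lock:
            # Check and load under the same lock, so only one thread loads.
            if not self._loaded or key != self._key:
                self._value = self._loader(embedder, sources)
                self._key = key
                self._loaded = True
            return self._value

    def reset(self) -> None:
        with self._lock:
            self._loaded = False
            self._key = None
            self._value = None
```

Holding the lock during the load keeps the logic simple but makes other threads wait for the load to finish; that trade-off seems acceptable when the alternative is loading the same knowledge several times.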

- Add caching mechanism in Agent.set_knowledge to track loaded state
- Skip knowledge reloading when sources and embedder haven't changed
- Add reset_knowledge_cache method for explicit cache clearing
- Add comprehensive tests for caching behavior and edge cases
- Fixes issue #3076 performance overhead on repeated kickoffs

Co-Authored-By: João <joao@crewai.com>
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@joaomdmoura
Collaborator

Disclaimer: This review was made by a crew of AI Agents.

Code Review Comments on Knowledge Caching Implementation

Overview

This pull request effectively implements a caching strategy for knowledge in the Agent class to optimize performance during repeated operations. The changes are made to agent.py and agent_test.py, and while the implementation demonstrates good practices, several key areas could benefit from refinement.


File: src/crewai/agent.py

Positive Aspects

  • Effective Caching Mechanism: The integration of a caching mechanism using instance attributes is well-conceived. This minimizes repetitive loading during multiple agent kickoffs.
  • Clear Cache Invalidation Logic: The logic to invalidate the cache is structured clearly, ensuring that the state is accurately managed.
  • Error Handling: The implementation incorporates specific exceptions which strengthen the robustness of the functionality.
  • Added Reset Functionality: A reset feature allows manual management of the cache, enhancing flexibility.

Areas for Improvement

  1. Cache State Variables Naming

    • Current:
      self._knowledge_loaded = True
      self._last_embedder = current_embedder
      self._last_knowledge_sources = self.knowledge_sources
    • Suggested:
      self._knowledge_cache = {
          'loaded': True,
          'embedder': current_embedder,
          'sources': self.knowledge_sources.copy() if self.knowledge_sources else None
      }
    • Reasoning: Consolidating cache-related variables into a single attribute enhances encapsulation and clarity.
  2. Cache Validation Logic Extraction

    • Suggested Implementation:
      def _is_knowledge_cache_valid(self, current_embedder) -> bool:
          # A single leading underscore avoids name mangling, so the lookup by string works.
          cache = getattr(self, '_knowledge_cache', None)
          if cache is None:
              return False
          return (cache['loaded'] and
                  self.knowledge is not None and
                  cache['embedder'] == current_embedder and
                  cache['sources'] == self.knowledge_sources)
    • Reasoning: Extracting validation into a separate method can improve readability and maintainability.
  3. Knowledge Sources Copying

    • Current:
      self._last_knowledge_sources = self.knowledge_sources.copy() if self.knowledge_sources else None
    • Suggested:
      self._last_knowledge_sources = copy.deepcopy(self.knowledge_sources) if self.knowledge_sources else None
    • Reasoning: deepcopy ensures that the cached sources are entirely independent of the original sources, safeguarding integrity.
  4. Type Hints

    • Suggested Addition:
      def reset_knowledge_cache(self) -> None:
    • Reasoning: Including type hints can enhance documentation and improve IDE support.
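The difference between the shallow and deep copy in item 3 matters for change detection when a source object is mutated in place. The sketch below illustrates this with a hypothetical `TextSource` dataclass that compares by value; whether the real knowledge source classes compare this way is an assumption.

```python
import copy
from dataclasses import dataclass


@dataclass
class TextSource:
    """Hypothetical knowledge source with value-based equality."""

    content: str


sources = [TextSource("Brandon's favorite color is blue.")]

# A shallow copy is a new list that still holds the same source objects.
shallow_snapshot = sources.copy()
# A deep copy also duplicates the source objects themselves.
deep_snapshot = copy.deepcopy(sources)

# Mutate a source in place, as a caller might between kickoffs.
sources[0].content = "Brandon's favorite color is red."

# The shallow snapshot changed along with the original, so it looks unchanged.
shallow_detects_change = shallow_snapshot != sources  # False
# The deep snapshot kept the old content, so the change is visible.
deep_detects_change = deep_snapshot != sources  # True
```

A deep copy costs more for large sources, so the choice depends on whether in-place mutation of sources is something the cache needs to catch.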

File: tests/agent_test.py

Positive Aspects

  • Comprehensive Test Coverage: The test file demonstrates thorough coverage for the new caching functionality, emphasizing both positive and negative cases.
  • Clarity in Test Descriptions: Good test case descriptions provide immediate context.
  • Effective Use of Mocking: The mocking strategies employed enhance the credibility of the tests.

Areas for Improvement

  1. Setup Duplication:

    • Current: Setup code is frequently repeated in tests.
    • Suggested:
      @pytest.fixture
      def test_agent():
          content = "Brandon's favorite color is blue."
          return Agent(
              role="Researcher",
              goal="Research about Brandon",
              backstory="You are a researcher.",
              knowledge_sources=[StringKnowledgeSource(content=content)]
          )
    • Reasoning: Using fixtures minimizes duplication and enhances maintainability.
  2. Assertion Messages:

    • Current:
      assert mock_add_sources.call_count == 1
    • Suggested:
      assert mock_add_sources.call_count == 1, "Knowledge sources should only be loaded once for cached content"
    • Reasoning: Adding messages to assertions helps in debugging failed tests.
  3. Edge Case Coverage:

    • Suggested Additional Test:
      def test_agent_knowledge_cache_with_invalid_sources(test_agent):
          agent = test_agent  # pytest injects the fixture; calling test_agent() directly raises an error
          agent.knowledge_sources = ["invalid source"]
          
          with pytest.raises(ValueError) as exc_info:
              agent.set_knowledge()
          assert "Invalid Knowledge Configuration" in str(exc_info.value)
    • Reasoning: Adding tests for edge cases ensures your logic can handle errors gracefully.

General Recommendations

  • Documentation: Expand docstrings to explain the caching mechanism and the conditions for cache invalidation. Comments will enhance understanding for future maintenance.
  • Performance Monitoring: Introduce logging for cache hits and misses to optimize performance and resource management, especially for larger datasets.
  • Code Organization: Consider encapsulating caching logic in a dedicated mixin for clarity and reusability. This can simplify the Agent class implementation.
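The logging recommendation could be sketched along these lines. `CacheStats`, the logger name, and the `agent_role` argument are hypothetical; the idea is only that each call to `set_knowledge()` reports whether it reused or reloaded knowledge.

```python
import logging

logger = logging.getLogger("knowledge_cache")  # hypothetical logger name


class CacheStats:
    """Hypothetical counter for knowledge cache hits and misses."""

    def __init__(self) -> None:
        self.hits = 0
        self.misses = 0

    def record(self, hit: bool, agent_role: str) -> None:
        if hit:
            self.hits += 1
            logger.debug("knowledge cache hit for agent %r", agent_role)
        else:
            self.misses += 1
            # Misses trigger the expensive reload, so log them more loudly.
            logger.info("knowledge cache miss for agent %r", agent_role)

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A hit rate near zero across repeated kickoffs would suggest the cache key is changing unexpectedly, for example because sources are rebuilt as new objects on every run.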

Conclusion

The PR enhances the Agent class's functionality by implementing an effective caching strategy that significantly improves performance. With a few refinements in code quality and testing strategies, the implementation can be made even more robust and maintainable. Overall, excellent work on the implementation!

devin-ai-integration bot and others added 3 commits June 27, 2025 10:01
- Add proper PrivateAttr declarations for cache attributes to fix mypy errors
- Simplify tests to focus on set_knowledge method directly instead of full kickoff
- Remove network calls and invalid method mocking from tests
- All knowledge caching functionality verified working locally

Co-Authored-By: João <joao@crewai.com>
- Remove hasattr/getattr calls that caused mypy type-checker errors
- Fix test mocking to use 'crewai.agent.Knowledge' for proper isolation
- Prevent network calls in tests by mocking Knowledge class constructor
- All knowledge-related tests now pass locally without API dependencies

Co-Authored-By: João <joao@crewai.com>
- Update VCR cassettes for knowledge-related tests
- Ensures CI has consistent test recordings

Co-Authored-By: João <joao@crewai.com>