Add hooked transformer generate stream #908


Open · wants to merge 6 commits into base: dev from add-HookedTransformer-generate_stream

Conversation

@anthonyduong9 (Contributor) commented Apr 11, 2025

Description

Adds a new method, HookedTransformer.generate_stream(). We wanted to add this in #847 but hadn't added tests, and we also want to complete hijohnnylin/neuronpedia#51. @hijohnnylin said to open a PR; if we merge this, neuronpedia can replace its fork of TransformerLens with the latest transformer-lens release as a dependency.
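For context, a streaming generate method like this is typically consumed as a generator, with each yielded value available immediately (e.g. for a UI). The sketch below is illustrative only: the stand-in generate_stream below is a toy function, not the actual TransformerLens signature added in this PR.

```python
# Illustrative sketch of how a streaming generate method is consumed.
# `generate_stream` here is a toy stand-in, NOT the real
# HookedTransformer.generate_stream() added in this PR.

from typing import Iterator

def generate_stream(prompt: str, max_new_tokens: int = 3) -> Iterator[str]:
    """Toy stand-in: yields the progressively extended output string."""
    output = prompt
    for i in range(max_new_tokens):
        output += f" tok{i}"
        yield output

# A caller streams partial results as they arrive; the last yielded
# value is the full completion (the invariant the PR's test checks).
final = None
for partial in generate_stream("Hello"):
    final = partial

print(final)  # Hello tok0 tok1 tok2
```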

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@anthonyduong9 anthonyduong9 force-pushed the add-HookedTransformer-generate_stream branch 2 times, most recently from 9fcf45a to b86b035 Compare April 11, 2025 23:15
@anthonyduong9 anthonyduong9 force-pushed the add-HookedTransformer-generate_stream branch from b86b035 to b7bce69 Compare April 11, 2025 23:16
@anthonyduong9 anthonyduong9 marked this pull request as ready for review April 12, 2025 00:29
@hijohnnylin

Before we merge this:

  • What is this new test? Was it just merged from a different branch? test_bloom_similarity_with_hf_model_with_kv_cache_activated_stream doesn't seem relevant.

  • Make outstanding issues to resolve afterwards: @anthonyduong9, can you create new issue(s) for these longer-term fixes?

  1. Reducing the duplicated code between generate and generate_stream
  2. Adding tests for generate_stream

@bryce13950 (Collaborator)

@hijohnnylin The test is from this commit in this PR b7bce69

@anthonyduong9 (Contributor, Author) commented May 6, 2025

> Before we merge this:
>
>   • What was this new test - was it just merged from a different branch? It doesn't seem relevant test_bloom_similarity_with_hf_model_with_kv_cache_activated_stream
>   • Make outstanding issues to resolve afterwards - @anthonyduong9 can you create new issue(s) for longer term fixes for this?
>   1. Reducing the duplicated code between generate and generate_stream
>   2. Adding tests for generate_stream

I added the test in this PR. It checks that the last value yielded by generate_stream() matches the output of AutoModelForCausalLM.generate(). It's analogous to the existing test for generate():

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformer_lens import HookedTransformer

def test_bloom_similarity_with_hf_model_with_kv_cache_activated():
    tf_model = HookedTransformer.from_pretrained(
        "bigscience/bloom-560m", default_prepend_bos=False, device="cpu"
    )
    hf_model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")
    hf_tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
    # `text` is the prompt string defined elsewhere in the test module
    output_tf = tf_model.generate(
        text, do_sample=False, use_past_kv_cache=True, verbose=False, max_new_tokens=10
    )
    output_hf_tokens = hf_model.generate(
        hf_tokenizer(text, return_tensors="pt").input_ids,
        do_sample=False,
        max_new_tokens=10,
    )
    output_hf_str = hf_tokenizer.decode(output_hf_tokens[0], skip_special_tokens=True)
    assert output_tf == output_hf_str
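The streaming variant of that test rests on one invariant: the final value yielded by the stream equals the non-streaming output. The sketch below illustrates that invariant with stub functions (fake_generate and fake_generate_stream are hypothetical stand-ins, since the real test loads bloom-560m and is too heavy to inline here).

```python
# Illustrative-only stubs: demonstrates the invariant the stream test
# relies on, i.e. the last yielded chunk equals the full generate() output.

def fake_generate(prompt: str) -> str:
    """Stand-in for a non-streaming generate(): returns the full output."""
    return prompt + " a b c"

def fake_generate_stream(prompt: str):
    """Stand-in for generate_stream(): yields progressively longer outputs."""
    words = fake_generate(prompt).split()
    for i in range(len(prompt.split()), len(words) + 1):
        yield " ".join(words[:i])

def test_stream_last_value_matches_generate():
    prompt = "hello"
    *_, last = fake_generate_stream(prompt)
    assert last == fake_generate(prompt)

test_stream_last_value_matches_generate()
print("ok")
```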

So I think that completes item 2. You and I talked about item 1 in person shortly after I opened this PR. I'm not sure it's worth the effort: I spent a lot of time trying to deduplicate code between the two functions for this PR, and not only did the functions stop working as expected, but extracting abstractions was awkward. This is probably because generate() is now much less similar to generate_stream() than when you first wrote the latter (after #820).

3 participants