{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":768612800,"defaultBranch":"main","name":"lmms-eval","ownerLogin":"EvolvingLMMs-Lab","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2024-03-07T12:09:25.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/154951679?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1727324924.0","currentOid":""},"activityList":{"items":[{"before":"bae0fee3cdb973b59935865e229ef3950a8b268b","after":null,"ref":"refs/heads/pufanyi/generate_until_multi_round_fix","pushedAt":"2024-09-26T04:28:44.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"pufanyi","name":"Pu Fanyi","path":"/pufanyi","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/44887683?s=80&v=4"}},{"before":"9d227f74be3ccf0f19b00f278f3381e67d0aa0c9","after":"ff0802ccf4afb97245c55eaf5b733780370f83a5","ref":"refs/heads/main","pushedAt":"2024-09-26T04:28:03.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"[Fix] Fix the error when running models caused by `generate_until_multi_round` (#281)\n\n* fix\r\n\r\n* lint","shortMessageHtmlLink":"[Fix] Fix the error when running models caused by `generate_until_mul…"}},{"before":"dfbddf814732276b49e427350fd526d507ef23ad","after":"9d227f74be3ccf0f19b00f278f3381e67d0aa0c9","ref":"refs/heads/main","pushedAt":"2024-09-26T04:27:30.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"[Feat] Add support for evaluation of Oryx models (#276)\n\n* support Oryx models\r\n\r\n* update oryx model\r\n\r\n* update oryx\r\n\r\n* update oryx\r\n\r\n---------\r\n\r\nCo-authored-by: dongyh20 <1342229580@qq.com>","shortMessageHtmlLink":"[Feat] Add support for evaluation of Oryx models (#276)"}},{"before":null,"after":"bae0fee3cdb973b59935865e229ef3950a8b268b","ref":"refs/heads/pufanyi/generate_until_multi_round_fix","pushedAt":"2024-09-25T19:38:08.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"pufanyi","name":"Pu Fanyi","path":"/pufanyi","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/44887683?s=80&v=4"},"commit":{"message":"lint","shortMessageHtmlLink":"lint"}},{"before":"c01d2f22145a8190d8cd92f01d630ed4908f8d69","after":null,"ref":"refs/heads/fix/model_name_mix_evals","pushedAt":"2024-09-25T07:52:20.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"}},{"before":"65d7db422bcba5bc8e3602a165bb5fb4f46b1b6d","after":"dfbddf814732276b49e427350fd526d507ef23ad","ref":"refs/heads/main","pushedAt":"2024-09-25T07:52:16.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"[Fix] Model name None in Task manager, mix eval model specific kwargs, claude retrying fix (#278)\n\n* Fix task manager model name None issue\r\n\r\n* Change model specific args in mix eval to lmms eval kwargs\r\n\r\n* json dump indent 4\r\n\r\n* lint\r\n\r\n* Fix claude always retrying 5 time error","shortMessageHtmlLink":"[Fix] Model name None in Task manager, mix eval model specific kwargs…"}},{"before":"259e4942e550d8a2af40a31117e72ce1211d685d","after":"65d7db422bcba5bc8e3602a165bb5fb4f46b1b6d","ref":"refs/heads/main","pushedAt":"2024-09-25T07:49:59.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"[Feat][Task] Add multi-round evaluation in llava-onevision; Add MMSearch Benchmark (#277)\n\n* Add MMSearch. Add generate_until_multi_round function in LLaVA-OneVision\r\n\r\n* Fix linting error\r\n\r\n* Update mutli-gpu end2end inference support.\r\nUpdate README.md about multi-round interaction.\r\n\r\n* Fix linting error\r\n\r\n* Fix linting error","shortMessageHtmlLink":"[Feat][Task] Add multi-round evaluation in llava-onevision; Add MMSea…"}},{"before":"a134a74432bc2c1cb308e7369d42d1cfc27f5d43","after":"c01d2f22145a8190d8cd92f01d630ed4908f8d69","ref":"refs/heads/fix/model_name_mix_evals","pushedAt":"2024-09-24T13:59:58.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"Fix claude always retrying 5 time error","shortMessageHtmlLink":"Fix claude always retrying 5 time error"}},{"before":"0f5ffa2b4e4d4aeb834ae9a5b90d5f5e75e787b5","after":"a134a74432bc2c1cb308e7369d42d1cfc27f5d43","ref":"refs/heads/fix/model_name_mix_evals","pushedAt":"2024-09-24T13:50:31.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"lint","shortMessageHtmlLink":"lint"}},{"before":"06144f33f211fb0b6576d5ffcf2cc44a717b5a79","after":"0f5ffa2b4e4d4aeb834ae9a5b90d5f5e75e787b5","ref":"refs/heads/fix/model_name_mix_evals","pushedAt":"2024-09-24T13:45:09.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"json dump indent 4","shortMessageHtmlLink":"json dump indent 4"}},{"before":null,"after":"06144f33f211fb0b6576d5ffcf2cc44a717b5a79","ref":"refs/heads/fix/model_name_mix_evals","pushedAt":"2024-09-24T13:40:23.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"json dump indent 4","shortMessageHtmlLink":"json dump indent 4"}},{"before":"559bd0178c4879e173b2d9e84ebab6048c07eac5","after":"259e4942e550d8a2af40a31117e72ce1211d685d","ref":"refs/heads/main","pushedAt":"2024-09-24T12:56:54.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"[feat] support video evaluation for qwen2-vl and add mix-evals-video2text (#275)\n\n* feat: add new ouput_path saving logic and add evaluation tracker to manage samples saving process\r\n\r\n* add: regression test\r\n\r\n* add: regression test\r\n\r\n* clean: unuseful code\r\n\r\n* 🚫 Remove unused import for cleaner code\r\n\r\nEliminated the commented-out import statement for WandbLogger to tidy up the code and enhance readability. This helps maintain focus on active components and prevents confusion over unused code. A cleaner structure contributes to better maintainability in the long run.\r\n\r\nNo functional changes were made, just a step towards a more streamlined codebase.\r\n\r\n* [task] add mix_evals for video evaluation\r\n\r\n* Merge branch 'origin/main'\r\n\r\n* ✨ Improve model name sanitization for Hugging Face formats\r\n\r\n* 🧹 Refactor settings for Llava OneVision model\r\n\r\n* ✨ Enhance video and image processing capabilities\r\n\r\n- Integrated vision processing for videos and images, improving context handling within the model.\r\n- Added error logging for missing utility dependencies to inform users about installation requirements.\r\n- Updated YAML configuration to standardize prompt handling for various video tasks.\r\n- Bumped version number to indicate ongoing development status.\r\n\r\nThese changes streamline how visuals are managed in the model, contributing to better assistant responses in tasks involving media.\r\n\r\n* 🎉 Enhance W&B logging and video playback\r\n\r\n- Added automatic naming for W&B runs if not specified, improving organization.\r\n- Updated video frame rate from 1.0 to 0.5 for better performance and resource management during visual content processing.\r\n- Streamlined W&B logging by removing redundant code, ensuring cleaner execution flow.\r\n\r\nThese changes optimize logging efficiency and enhance the overall user experience.\r\n\r\n* ✨ Refine conversation logic and adjust token limits\r\n\r\n- Updated chat template logic for better formatting in responses, ensuring consistent handling of user and assistant roles.\r\n- Reduced maximum new tokens in multiple evaluation files to ensure more concise outputs and improve efficiency.\r\n- Enhanced clarity in few-shot tasks by explicitly labeling question and answer roles in generated text.\r\n- Simplified logging of contextual and target information during evaluation, ensuring better tracking of results.\r\n\r\nThese adjustments improve the overall output quality and streamline the evaluation processes.\r\n\r\n* feat: change qwen2 vl video reading to 0.25 fps to avoid oom\r\n\r\n* 🎥 Update video message structure in Qwen2_VL\r\n\r\n* Update qwen2_vl.py","shortMessageHtmlLink":"[feat] support video evaluation for qwen2-vl and add mix-evals-video2…"}},{"before":"34b35cd3c29a963b0628fd33e5030ef9ebc2f2a4","after":"f0f5e62cedba04f9e104bd091212fff178c52efb","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-24T12:55:36.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"Update qwen2_vl.py","shortMessageHtmlLink":"Update qwen2_vl.py"}},{"before":"638b2e0cab8a7f6b6e119b6f5ba89f1e8b50be39","after":"34b35cd3c29a963b0628fd33e5030ef9ebc2f2a4","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-24T08:05:33.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"🎥 Update video message structure in Qwen2_VL","shortMessageHtmlLink":"🎥 Update video message structure in Qwen2_VL"}},{"before":"45060353fd7048311167e664cb5e583dc1c7ed92","after":"638b2e0cab8a7f6b6e119b6f5ba89f1e8b50be39","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-24T03:43:49.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"feat: change qwen2 vl video reading to 0.25 fps to avoid oom","shortMessageHtmlLink":"feat: change qwen2 vl video reading to 0.25 fps to avoid oom"}},{"before":"92b604490676bc31be4c9543d8a026d3ecb99774","after":"45060353fd7048311167e664cb5e583dc1c7ed92","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-23T19:56:53.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"✨ Refine conversation logic and adjust token limits\n\n- Updated chat template logic for better formatting in responses, ensuring consistent handling of user and assistant roles.\n- Reduced maximum new tokens in multiple evaluation files to ensure more concise outputs and improve efficiency.\n- Enhanced clarity in few-shot tasks by explicitly labeling question and answer roles in generated text.\n- Simplified logging of contextual and target information during evaluation, ensuring better tracking of results.\n\nThese adjustments improve the overall output quality and streamline the evaluation processes.","shortMessageHtmlLink":"✨ Refine conversation logic and adjust token limits"}},{"before":"d96f60dc752051c8d49d16443c4a98e3614021e4","after":"92b604490676bc31be4c9543d8a026d3ecb99774","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-23T16:50:33.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"🎉 Enhance W&B logging and video playback\n\n- Added automatic naming for W&B runs if not specified, improving organization.\n- Updated video frame rate from 1.0 to 0.5 for better performance and resource management during visual content processing.\n- Streamlined W&B logging by removing redundant code, ensuring cleaner execution flow.\n\nThese changes optimize logging efficiency and enhance the overall user experience.","shortMessageHtmlLink":"🎉 Enhance W&B logging and video playback"}},{"before":"c0c0bbaecd58aa24506e6edacae25076d3a47017","after":"559bd0178c4879e173b2d9e84ebab6048c07eac5","ref":"refs/heads/main","pushedAt":"2024-09-23T08:07:06.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"Update commands.md","shortMessageHtmlLink":"Update commands.md"}},{"before":"ec2e6f12f6a83dedf29ed8f320990341fdfebaf4","after":"3beffaf6a5f5ab4bf9e9b216a30f2aef4d7cf135","ref":"refs/heads/kr/video","pushedAt":"2024-09-23T03:47:00.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"KairuiHu","name":"Hukairui","path":"/KairuiHu","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/71913595?s=80&v=4"},"commit":{"message":"[Bug] Fix the checking error in gpt4v generation (#274)\n\n* fix\r\n\r\n* Fix error handling in GPT4V class","shortMessageHtmlLink":"[Bug] Fix the checking error in gpt4v generation (#274)"}},{"before":"7fc3fbb4d400a57a05c4fdd34d7c686e5529f7fd","after":"ec2e6f12f6a83dedf29ed8f320990341fdfebaf4","ref":"refs/heads/kr/video","pushedAt":"2024-09-23T03:43:07.000Z","pushType":"push","commitsCount":12,"pusher":{"login":"KairuiHu","name":"Hukairui","path":"/KairuiHu","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/71913595?s=80&v=4"},"commit":{"message":"Update current tasks.md (#272)\n\n* update and sort tasks\r\n\r\n* Update current_tasks.md\r\n\r\n* minor changes","shortMessageHtmlLink":"Update current tasks.md (#272)"}},{"before":"61aa007db8d157facc90fa0f5936b0742f05da74","after":"d96f60dc752051c8d49d16443c4a98e3614021e4","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-22T18:11:35.000Z","pushType":"push","commitsCount":18,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"Merge branch 'main' into dev/fix_output_path","shortMessageHtmlLink":"Merge branch 'main' into dev/fix_output_path"}},{"before":"7646dbca1f7ad67b99a90819c282723c36c85e21","after":"61aa007db8d157facc90fa0f5936b0742f05da74","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-22T17:57:47.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"✨ Enhance video and image processing capabilities\n\n- Integrated vision processing for videos and images, improving context handling within the model.\n- Added error logging for missing utility dependencies to inform users about installation requirements.\n- Updated YAML configuration to standardize prompt handling for various video tasks.\n- Bumped version number to indicate ongoing development status.\n\nThese changes streamline how visuals are managed in the model, contributing to better assistant responses in tasks involving media.","shortMessageHtmlLink":"✨ Enhance video and image processing capabilities"}},{"before":"21a906accae208779a16d4d0fb7b7bf5b54f4fbb","after":"7646dbca1f7ad67b99a90819c282723c36c85e21","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-22T14:50:26.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"🧹 Refactor settings for Llava OneVision model","shortMessageHtmlLink":"🧹 Refactor settings for Llava OneVision model"}},{"before":"b198dd7eb89b25bd2b307bd9cc40752f144396d5","after":"21a906accae208779a16d4d0fb7b7bf5b54f4fbb","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-22T14:30:32.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"✨ Improve model name sanitization for Hugging Face formats","shortMessageHtmlLink":"✨ Improve model name sanitization for Hugging Face formats"}},{"before":"74522384310a79ce0cc933a6498d3a941bcddd57","after":"b198dd7eb89b25bd2b307bd9cc40752f144396d5","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-22T14:21:08.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"Merge branch 'origin/main'","shortMessageHtmlLink":"Merge branch 'origin/main'"}},{"before":"920c211c7d4a01ff3c81fb2967d53272b45c8b69","after":"74522384310a79ce0cc933a6498d3a941bcddd57","ref":"refs/heads/dev/fix_output_path","pushedAt":"2024-09-22T14:19:42.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"[task] add mix_evals for video evaluation","shortMessageHtmlLink":"[task] add mix_evals for video evaluation"}},{"before":"f30ad8fd2524338e55b7f4b1b673eba55a92ef5e","after":null,"ref":"refs/heads/pufanyi/gpt4v_fix","pushedAt":"2024-09-22T11:12:54.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"pufanyi","name":"Pu Fanyi","path":"/pufanyi","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/44887683?s=80&v=4"}},{"before":"0bae7fa803c47883d20999b56b8fd9b4a41ea8c9","after":"c0c0bbaecd58aa24506e6edacae25076d3a47017","ref":"refs/heads/main","pushedAt":"2024-09-22T09:50:24.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"Update current tasks.md (#272)\n\n* update and sort tasks\r\n\r\n* Update current_tasks.md\r\n\r\n* minor changes","shortMessageHtmlLink":"Update current tasks.md (#272)"}},{"before":"f0b6a8ea485c32048f4eface4eaa7413cec0ea0a","after":"0bae7fa803c47883d20999b56b8fd9b4a41ea8c9","ref":"refs/heads/main","pushedAt":"2024-09-22T09:50:05.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"Support new task mmworld (#269)\n\n* support new task mmworld\r\n\r\n* Apply linting fixes","shortMessageHtmlLink":"Support new task mmworld (#269)"}},{"before":"be9e46c7d47ad390901aaf1562ac4efdef980449","after":"f0b6a8ea485c32048f4eface4eaa7413cec0ea0a","ref":"refs/heads/main","pushedAt":"2024-09-22T09:49:55.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Luodian","name":"Li Bo","path":"/Luodian","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15847405?s=80&v=4"},"commit":{"message":"[Model] support Qwen2 VL (#268)\n\n* add qwen2-vl\r\n\r\n* qwen2 vl (black isort)\r\n\r\n* qwen2 vl black\r\n\r\n* black\r\n\r\n* without qwen vl utils and temp images\r\n\r\n* black\r\n\r\n* isort\r\n\r\n* qwen2 vl batch generate\r\n\r\n* remove unused import\r\n\r\n* remove unreferenced","shortMessageHtmlLink":"[Model] support Qwen2 VL (#268)"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"startCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0yNlQwNDoyODo0NC4wMDAwMDBazwAAAATBCJT_","endCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0yMlQwOTo0OTo1NS4wMDAwMDBazwAAAAS9G4o8"}},"title":"Activity · EvolvingLMMs-Lab/lmms-eval"}