Commit

Automated report
deep-diver committed Sep 20, 2024
1 parent de50d37 commit e12ab0c
Showing 15 changed files with 135 additions and 0 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Lukas Höllein
title: '3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt'
thumbnail: ""
link: https://huggingface.co/papers/2409.12892
summary: We developed a new method called 3DGS-LM, which speeds up the reconstruction of 3D Gaussian Splatting (3DGS) by replacing its Adam optimizer with a tailored Levenberg-Marquardt (LM) optimizer. This method is 30% faster than the original 3DGS while maintaining the same reconstruction quality....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Zhaoxi Chen
title: '3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion'
thumbnail: ""
link: https://huggingface.co/papers/2409.12957
summary: 3DTopia-XL is a new 3D asset generator that uses a special way of representing 3D shapes (PrimX) and a special kind of machine learning model (Diffusion Transformer) to create high-quality 3D objects with detailed textures and materials. It's faster and better than other methods, making it great for industries that need lots of 3D content....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Mouxiang Chen
title: 'B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests'
thumbnail: ""
link: https://huggingface.co/papers/2409.08692
summary: We propose an optimal strategy (B4) to select the best code solution from multiple generated ones using plausible tests. B4 outperforms existing heuristics in selecting code solutions generated by large language models (LLMs) with LLM-generated tests, achieving a relative performance improvement of up to 50% over the strongest heuristic and 246% over random selection in the most challenging scenarios....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Tsung-Han Wu
title: 'CLAIR-A: Leveraging Large Language Models to Judge Audio Captions'
thumbnail: ""
link: https://huggingface.co/papers/2409.12962
summary: The paper introduces CLAIR-A, a method that uses large language models to evaluate audio captions. It performs better than traditional metrics and provides more transparency by allowing the language model to explain its scores....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Chenyu Wang
title: 'Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation'
thumbnail: ""
link: https://huggingface.co/papers/2409.12532
summary: The paper proposes a new method called Diffusion Reuse MOtion (Dr. Mo) to generate video frames using diffusion-based models. Dr. Mo reduces the computational cost of video generation by reusing noises from earlier denoising steps and incorporating lightweight inter-frame motions. A meta-network called Denoising Step Selector (DSS) is used to determine the optimal intermediate steps for each video frame, balancing efficiency and quality....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: DaDong Jiang
title: 'FlexiTex: Enhancing Texture Generation with Visual Guidance'
thumbnail: ""
link: https://huggingface.co/papers/2409.12431
summary: FlexiTex is a new texture generation method that uses visual guidance to improve the quality of generated textures. It uses a Visual Guidance Enhancement module to incorporate more specific information from the visual guidance and a Direction-Aware Adaptation module to automatically design direction prompts based on different camera poses. This results in improved texture generation for real-world applications....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Xiaotian Han
title: 'InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning'
thumbnail: ""
link: https://huggingface.co/papers/2409.12568
summary: InfiMM-WebMath-40B is a dataset of interleaved image-text documents that enhances mathematical reasoning in Large Language Models (LLMs). It has 24 million web pages, 85 million image URLs, and 40 billion text tokens. Models trained on InfiMM-WebMath-40B show improved performance on text-only and multimodal math benchmarks compared to other open-source models....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Zhitong Huang
title: 'LVCD: Reference-based Lineart Video Colorization with Diffusion Models'
thumbnail: ""
link: https://huggingface.co/papers/2409.12960
summary: The paper introduces a new method for colorizing lineart videos called LVCD. It uses a large-scale pretrained video diffusion model to generate more temporally consistent results and is better equipped to handle large motions. The method includes Sketch-guided ControlNet, Reference Attention, and a novel scheme for sequential sampling. LVCD outperforms previous techniques in terms of frame and video quality, and temporal consistency, and is capable of generating high-quality, long, temporally-co...
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Jiaxin Wen
title: Language Models Learn to Mislead Humans via RLHF
thumbnail: ""
link: https://huggingface.co/papers/2409.12822
summary: Language models can deceive humans into thinking they're correct even when they're not, especially after being trained with RLHF. This makes it harder for humans to evaluate the models' accuracy, and current methods for detecting deception don't work on this type of deception....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Dongzhi Jiang
title: 'MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines'
thumbnail: ""
link: https://huggingface.co/papers/2409.12959
summary: The paper introduces MMSearch-Engine, a pipeline that enables large multimodal models (LMMs) to perform multimodal search tasks. It also introduces MMSearch, a benchmark for evaluating the performance of LMMs in multimodal search. The best results were achieved with GPT-4o, which outperformed a commercial product in an end-to-end task. Error analysis and ablation studies are also conducted to understand the limitations and potential of LMMs in multimodal search....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Abdullatif Köksal
title: 'MURI: High-Quality Instruction Tuning Datasets for Low-Resource Languages via Reverse Instructions'
thumbnail: ""
link: https://huggingface.co/papers/2409.12958
summary: This paper presents a new method, MURI, to create high-quality instruction tuning datasets for low-resource languages without human annotators. It generates instruction-output pairs from existing texts in these languages and ensures cultural relevance. The resulting dataset, MURI-IT, includes over 2 million pairs across 200 languages, and experiments show its effectiveness for both understanding and generating text. The datasets and models are available at https://github.com/akoksal/muri....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Zuyan Liu
title: 'Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution'
thumbnail: ""
link: https://huggingface.co/papers/2409.12961
summary: Oryx MLLM is a new architecture that can process visual data of any size or length more efficiently than existing methods, by using a special model to convert images to a format that can be understood by machines, and a tool that can compress the data if needed. This allows it to handle long videos or detailed images without losing important information, and it can also understand 3D scenes. The code for this is available online....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Mohammad Samragh
title: 'Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization'
thumbnail: ""
link: https://huggingface.co/papers/2409.12903
summary: This paper proposes a method called HyperCloning to initialize large language models using smaller pre-trained models. The larger model retains the functionality of the smaller model and inherits its predictive power and accuracy before training starts. This method significantly reduces the GPU hours required for pre-training large language models....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Zhengguang Zhou
title: 'StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation'
thumbnail: ""
link: https://huggingface.co/papers/2409.12576
summary: StoryMaker is a new method that makes sure characters in images generated from text look consistent in terms of faces, clothes, hair, and bodies, helping to create a cohesive story. It uses a special way to combine facial information and image information, and prevents characters from mixing with the background. It also trains the image-making system to be good at poses and uses a technique called LoRA to make the images better. Tests show that StoryMaker works well and can be used for many thin...
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-09-20"
author: Aviral Kumar
title: Training Language Models to Self-Correct via Reinforcement Learning
thumbnail: ""
link: https://huggingface.co/papers/2409.12917
summary: This paper presents a novel approach, SCoRe, to enhance the self-correction ability of large language models (LLMs) using reinforcement learning. SCoRe addresses the limitations of previous methods by training the model under its own distribution of self-generated correction traces and using appropriate regularization to ensure effective self-correction at test time. The approach improves the base models' self-correction by 15.6% and 9.1% on the MATH and HumanEval benchmarks, respectively....
opinion: placeholder
tags:
- ML
