Automated report

deep-diver · Sep 23, 2024 · 5fe634b
1 parent 37414cd
commit 5fe634b

Showing 11 changed files with 99 additions and 0 deletions.
diff --git a/current/2024-09-23 Colorful Diffuse Intrinsic Image Decomposition in the Wild.yaml b/current/2024-09-23 Colorful Diffuse Intrinsic Image Decomposition in the Wild.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Chris Careaga
+title: Colorful Diffuse Intrinsic Image Decomposition in the Wild
+thumbnail: ""
+link: https://huggingface.co/papers/2409.13690
+summary: This paper presents a method for separating the different components of an image, such as surface reflectance and lighting effects, even in complex, real-world scenarios. By breaking down the problem into simpler sub-problems, they are able to estimate the lighting effects in images, which can be used for various image editing applications such as removing specular highlights and adjusting white balance....
+opinion: placeholder
+tags:
+    - ML
diff --git a/...9-23 Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation.yaml b/...9-23 Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Satyapriya Krishna
+title: 'Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation'
+thumbnail: ""
+link: https://huggingface.co/papers/2409.12941
+summary: This paper introduces a new evaluation dataset called FRAMES for testing Large Language Models' abilities in retrieval-augmented generation. FRAMES is designed to test the models' ability to provide factual responses, assess retrieval capabilities, and evaluate reasoning required to generate final answers. The dataset includes challenging multi-hop questions that require the integration of information from multiple sources. The paper presents baseline results demonstrating that even state-of-the...
+opinion: placeholder
+tags:
+    - ML
diff --git a/.../2024-09-23 Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments.yaml b/.../2024-09-23 Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Maria Rigaki
+title: 'Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments'
+thumbnail: ""
+link: https://huggingface.co/papers/2409.11276
+summary: Hackphyr is a locally fine-tuned Large Language Model (LLM) agent for network security environments. It runs on a single GPU card and performs as well as larger, more expensive models. It was trained with a new cybersecurity dataset and outperforms other models in complex, unseen scenarios. The paper also analyzes the agent's behavior to understand its planning abilities and limitations....
+opinion: placeholder
+tags:
+    - ML
diff --git a/current/2024-09-23 Imagine yourself: Tuning-Free Personalized Image Generation.yaml b/current/2024-09-23 Imagine yourself: Tuning-Free Personalized Image Generation.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Zecheng He
+title: 'Imagine yourself: Tuning-Free Personalized Image Generation'
+thumbnail: ""
+link: https://huggingface.co/papers/2409.13346
+summary: Imagine yourself is a new model for personalized image generation that doesn't require individual adjustments. It improves on previous models by adding a new data generation method, a new attention architecture, and a new training method. This model is better at preserving identity, following text prompts, and creating visually appealing images compared to other personalization models....
+opinion: placeholder
+tags:
+    - ML
diff --git a/...trel: Structural Prompt Generation with Multi-Agents Coordination for Non-AI Experts.yaml b/...trel: Structural Prompt Generation with Multi-Agents Coordination for Non-AI Experts.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Ming Wang
+title: 'Minstrel: Structural Prompt Generation with Multi-Agents Coordination for Non-AI Experts'
+thumbnail: ""
+link: https://huggingface.co/papers/2409.13449
+summary: This paper introduces LangGPT, a framework for creating structural prompts for LLMs, and Minstrel, a system that uses multiple generative agents to create these prompts. The paper shows that these prompts improve the performance of LLMs and are easy to use, according to a survey of non-AI experts....
+opinion: placeholder
+tags:
+    - ML
diff --git a/current/2024-09-23 MuCodec: Ultra Low-Bitrate Music Codec.yaml b/current/2024-09-23 MuCodec: Ultra Low-Bitrate Music Codec.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Yaoxun Xu
+title: 'MuCodec: Ultra Low-Bitrate Music Codec'
+thumbnail: ""
+link: https://huggingface.co/papers/2409.13216
+summary: MuCodec is a new music codec that can reconstruct high-quality music even at very low bitrates by using a combination of different techniques to extract and encode the music's acoustic and semantic features. This allows for better compression and reconstruction of both the music's vocals and backgrounds, leading to the best results so far in both subjective and objective metrics....
+opinion: placeholder
+tags:
+    - ML
diff --git a/current/2024-09-23 Portrait Video Editing Empowered by Multimodal Generative Priors.yaml b/current/2024-09-23 Portrait Video Editing Empowered by Multimodal Generative Priors.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Xuan Gao
+title: Portrait Video Editing Empowered by Multimodal Generative Priors
+thumbnail: ""
+link: https://huggingface.co/papers/2409.13591
+summary: PortraitGen is a powerful tool for editing portrait videos. It uses a 3D model and a special texture to make the editing look good and go fast. It also uses knowledge from large generative models to help with editing. The tool can handle many different types of editing, like changing the style or adding light, and it works well on videos. You can see examples of it in action on the project's website....
+opinion: placeholder
+tags:
+    - ML
diff --git a/current/2024-09-23 Prithvi WxC: Foundation Model for Weather and Climate.yaml b/current/2024-09-23 Prithvi WxC: Foundation Model for Weather and Climate.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Johannes Schmude
+title: 'Prithvi WxC: Foundation Model for Weather and Climate'
+thumbnail: ""
+link: https://huggingface.co/papers/2409.13598
+summary: Prithvi WxC is a 2.3 billion parameter model that uses AI to predict the weather. It can be used for different tasks such as forecasting, downscaling, or nowcasting. It was trained with 160 weather variables and can predict weather phenomena at fine resolutions. The model is open-source and can be used by anyone....
+opinion: placeholder
+tags:
+    - ML
diff --git a/current/2024-09-23 Temporally Aligned Audio for Video with Autoregression.yaml b/current/2024-09-23 Temporally Aligned Audio for Video with Autoregression.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Ilpo Viertola
+title: Temporally Aligned Audio for Video with Autoregression
+thumbnail: ""
+link: https://huggingface.co/papers/2409.13689
+summary: V-AURA is a new system that generates better audio for videos by understanding what is happening in the video and keeping the sounds aligned with it in time. It uses a special way to look at the video and combine it with sounds, and the authors built a new set of videos to train it on. V-AURA matches sounds to the video better than other systems while still producing good audio....
+opinion: placeholder
+tags:
+    - ML
diff --git a/...-09-23 V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians.yaml b/...-09-23 V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Penghao Wang
+title: 'V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians'
+thumbnail: ""
+link: https://huggingface.co/papers/2409.13648
+summary: V^3 is a new way to watch high-quality 3D videos on mobile devices. It works by turning the videos into a format that can be streamed easily and played on regular video players. This makes it possible to watch 3D videos on mobile devices without any special hardware or software....
+opinion: placeholder
+tags:
+    - ML
diff --git a/...dal Dataset for evaluating Satire Comprehension capability of Vision-Language Models.yaml b/...dal Dataset for evaluating Satire Comprehension capability of Vision-Language Models.yaml
@@ -0,0 +1,9 @@
+date: "2024-09-23"
+author: Abhilash Nandy
+title: 'YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models'
+thumbnail: ""
+link: https://huggingface.co/papers/2409.13592
+summary: The paper introduces a new dataset called YesBut, which is specifically designed to evaluate the ability of Vision-Language models to understand satirical images. The dataset contains 2547 images, with half being satirical and half not, and is available for researchers to use. The paper also shows that current Vision-Language models struggle with the tasks in the YesBut dataset, highlighting the need for further research in this area....
+opinion: placeholder
+tags:
+    - ML