
Showing 1–50 of 638 results for author: Ye, X

  1. arXiv:2409.07979  [pdf, ps, other]

    math.DS

    Multiple recurrence without commutativity

    Authors: Wen Huang, Song Shao, Xiangdong Ye

    Abstract: We study multiple recurrence without commutativity in this paper. We show that for any two homeomorphisms $T,S: X\rightarrow X$ with $(X,T)$ and $(X,S)$ being minimal, there is a residual subset $X_0$ of $X$ such that for any $x\in X_0$ and any nonlinear integral polynomials $p_1,\ldots, p_d$ vanishing at $0$, there is some subsequence $\{n_i\}$ of $\mathbb Z$ with $n_i\to \infty$ satisfying…

    Submitted 12 September, 2024; originally announced September 2024.

    Comments: 40 pages. arXiv admin note: text overlap with arXiv:2301.07873; text overlap with arXiv:2405.11251 by other authors

  2. arXiv:2409.00633  [pdf, other]

    cs.CV

    Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression

    Authors: Dingyuan Zhang, Dingkang Liang, Zichang Tan, Xiaoqing Ye, Cheng Zhang, Jingdong Wang, Xiang Bai

    Abstract: Slow inference speed is one of the most crucial concerns for deploying multi-view 3D detectors to tasks with high real-time requirements like autonomous driving. Although many sparse query-based methods have already attempted to improve the efficiency of 3D detectors, they neglect to consider the backbone, especially when using Vision Transformers (ViT) for better performance. To tackle this probl…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: Accepted by ECCV 2024

  3. arXiv:2409.00494  [pdf, other]

    cs.AI cs.SE

    GenAI-powered Multi-Agent Paradigm for Smart Urban Mobility: Opportunities and Challenges for Integrating Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) with Intelligent Transportation Systems

    Authors: Haowen Xu, Jinghui Yuan, Anye Zhou, Guanhao Xu, Wan Li, Xuegang Ban, Xinyue Ye

    Abstract: Leveraging recent advances in generative AI, multi-agent systems are increasingly being developed to enhance the functionality and efficiency of smart city applications. This paper explores the transformative potential of large language models (LLMs) and emerging Retrieval-Augmented Generation (RAG) technologies in Intelligent Transportation Systems (ITS), paving the way for innovative solutions t…

    Submitted 4 September, 2024; v1 submitted 31 August, 2024; originally announced September 2024.

  4. arXiv:2408.17325  [pdf, other]

    cs.CL cond-mat.dis-nn cond-mat.stat-mech

    Impact of ChatGPT on the writing style of condensed matter physicists

    Authors: Shaojun Xu, Xiaohui Ye, Mengqi Zhang, Pei Wang

    Abstract: We apply a state-of-the-art difference-in-differences approach to estimate the impact of ChatGPT's release on the writing style of condensed matter papers on arXiv. Our analysis reveals a statistically significant improvement in the English quality of abstracts written by non-native English speakers. Importantly, this improvement remains robust even after accounting for other potential factors, co…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: 9 pages, 1 figure, 7 tables

  5. arXiv:2408.16975  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Technical Report of HelixFold3 for Biomolecular Structure Prediction

    Authors: Lihang Liu, Shanzhuo Zhang, Yang Xue, Xianbin Ye, Kunrui Zhu, Yuxin Li, Yang Liu, Wenlai Zhao, Hongkun Yu, Zhihua Wu, Xiaonan Zhang, Xiaomin Fang

    Abstract: The AlphaFold series has transformed protein structure prediction with remarkable accuracy, often matching experimental methods. AlphaFold2, AlphaFold-Multimer, and the latest AlphaFold3 represent significant strides in predicting single protein chains, protein complexes, and biomolecular structures. While AlphaFold2 and AlphaFold-Multimer are open-sourced, facilitating rapid and reliable predicti…

    Submitted 8 September, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

  6. arXiv:2408.16375  [pdf, other]

    cs.RO

    EasyChauffeur: A Baseline Advancing Simplicity and Efficiency on Waymax

    Authors: Lingyu Xiao, Jiang-Jiang Liu, Xiaoqing Ye, Wankou Yang, Jingdong Wang

    Abstract: Recent advancements in deep-learning-based driving planners have primarily focused on elaborate network engineering, yielding limited improvements. This paper diverges from conventional approaches by exploring three fundamental yet underinvestigated aspects: training policy, data efficiency, and evaluation robustness. We introduce EasyChauffeur, a reproducible and effective planner for both imitat…

    Submitted 29 August, 2024; originally announced August 2024.

  7. arXiv:2408.15089  [pdf, other]

    cs.AR cs.LG

    SiHGNN: Leveraging Properties of Semantic Graphs for Efficient HGNN Acceleration

    Authors: Runzhen Xue, Mingyu Yan, Dengke Han, Zhimin Tang, Xiaochun Ye, Dongrui Fan

    Abstract: Heterogeneous Graph Neural Networks (HGNNs) have expanded graph representation learning to heterogeneous graph fields. Recent studies have demonstrated their superior performance across various applications, including medical analysis and recommendation systems, often surpassing existing methods. However, GPUs often experience inefficiencies when executing HGNNs due to their unique and complex exe…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: 12 pages, 18 figures. arXiv admin note: text overlap with arXiv:2404.04792

  8. arXiv:2408.14438  [pdf, other]

    cs.CL cs.CY

    Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study

    Authors: Liuchang Xu, Shuo Zhao, Qingming Lin, Luyao Chen, Qianqian Luo, Sensen Wu, Xinyue Ye, Hailin Feng, Zhenhong Du

    Abstract: The advent of large language models such as ChatGPT, Gemini, and others has underscored the importance of evaluating their diverse capabilities, ranging from natural language understanding to code generation. However, their performance on spatial tasks has not been comprehensively assessed. This study addresses this gap by introducing a novel multi-task spatial evaluation dataset, designed to syst…

    Submitted 2 September, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

  9. arXiv:2408.08490  [pdf, other]

    cs.AR

    Accelerating Mini-batch HGNN Training by Reducing CUDA Kernels

    Authors: Meng Wu, Jingkai Qiu, Mingyu Yan, Wenming Li, Yang Zhang, Zhimin Zhang, Xiaochun Ye, Dongrui Fan

    Abstract: Heterogeneous graph neural networks (HGNNs) are essential for capturing the structure and semantic information in heterogeneous graphs. However, existing GPU-based solutions, such as PyTorch Geometric, suffer from low GPU utilization due to numerous short-execution-time and memory-bound CUDA kernels during HGNN training. To address this issue, we introduce HiFuse, an enhancement for PyTorch Geom…

    Submitted 15 August, 2024; originally announced August 2024.

  10. arXiv:2408.06400  [pdf, other]

    physics.ao-ph cs.LG

    MetMamba: Regional Weather Forecasting with Spatial-Temporal Mamba Model

    Authors: Haoyu Qin, Yungang Chen, Qianchuan Jiang, Pengchao Sun, Xiancai Ye, Chao Lin

    Abstract: Deep Learning based Weather Prediction (DLWP) models have been improving rapidly over the last few years, surpassing state-of-the-art numerical weather forecasts by significant margins. While much of the optimization effort is focused on training curriculum to extend forecast range in the global context, two aspects remain less explored: limited area modeling and better backbones for weather fore…

    Submitted 14 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

    Comments: Typo and grammar; Minor elaboration and clarifications; Use full organization name in the author section

  11. arXiv:2408.03353  [pdf, other]

    cs.LG cs.AI cs.HC

    Adversarial Domain Adaptation for Cross-user Activity Recognition Using Diffusion-based Noise-centred Learning

    Authors: Xiaozhou Ye, Kevin I-Kai Wang

    Abstract: Human Activity Recognition (HAR) plays a crucial role in various applications such as human-computer interaction and healthcare monitoring. However, challenges persist in HAR models due to the data distribution differences between training and real-world data distributions, particularly evident in cross-user scenarios. This paper introduces a novel framework, termed Diffusion-based Noise-centered…

    Submitted 31 August, 2024; v1 submitted 6 August, 2024; originally announced August 2024.

  12. arXiv:2408.01902  [pdf, other]

    cs.AR

    A Comprehensive Survey on GNN Characterization

    Authors: Meng Wu, Mingyu Yan, Wenming Li, Xiaochun Ye, Dongrui Fan, Yuan Xie

    Abstract: Characterizing graph neural networks (GNNs) is essential for identifying performance bottlenecks and facilitating their deployment. Despite substantial work in this area, a comprehensive survey on GNN characterization is lacking. This work presents a comprehensive survey, proposing a triple-level classification method to categorize, summarize, and compare existing efforts. In addition, we identify…

    Submitted 15 August, 2024; v1 submitted 3 August, 2024; originally announced August 2024.

  13. arXiv:2407.18456  [pdf, other]

    physics.optics cs.CV

    Diffusion-driven lensless fiber endomicroscopic quantitative phase imaging towards digital pathology

    Authors: Zhaoqing Chen, Jiawei Sun, Xinyi Ye, Bin Zhao, Xuelong Li, Juergen Czarske

    Abstract: Lensless fiber endomicroscope is an emerging tool for in-vivo microscopic imaging, where quantitative phase imaging (QPI) can be utilized as a label-free method to enhance image contrast. However, existing single-shot phase reconstruction methods through lensless fiber endomicroscope typically perform well on simple images but struggle with complex microscopic structures. Here, we propose a speckl…

    Submitted 13 September, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

  14. arXiv:2407.18232  [pdf, other]

    cs.CV

    LION: Linear Group RNN for 3D Object Detection in Point Clouds

    Authors: Zhe Liu, Jinghua Hou, Xinyu Wang, Xiaoqing Ye, Jingdong Wang, Hengshuang Zhao, Xiang Bai

    Abstract: The benefit of transformers in large-scale 3D point cloud perception tasks, such as 3D object detection, is limited by their quadratic computation cost when modeling long-range relationships. In contrast, linear RNNs have low computational complexity and are suitable for long-range modeling. Toward this goal, we propose a simple and effective window-based framework built on LInear grOup RNN (i.e.,…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: Project page: https://happinesslz.github.io/projects/LION/

  15. arXiv:2407.16981  [pdf, other]

    cs.CV cs.AI

    Case-Enhanced Vision Transformer: Improving Explanations of Image Similarity with a ViT-based Similarity Metric

    Authors: Ziwei Zhao, David Leake, Xiaomeng Ye, David Crandall

    Abstract: This short paper presents preliminary research on the Case-Enhanced Vision Transformer (CEViT), a similarity measurement method aimed at improving the explainability of similarity assessments for image data. Initial experimental results suggest that integrating CEViT into k-Nearest Neighbor (k-NN) classification yields classification accuracy comparable to state-of-the-art computer vision models,…

    Submitted 23 July, 2024; originally announced July 2024.

  16. arXiv:2407.15334  [pdf, other]

    cs.CV

    Explore the LiDAR-Camera Dynamic Adjustment Fusion for 3D Object Detection

    Authors: Yiran Yang, Xu Gao, Tong Wang, Xin Hao, Yifeng Shi, Xiao Tan, Xiaoqing Ye, Jingdong Wang

    Abstract: Camera and LiDAR serve as informative sensors for accurate and robust autonomous driving systems. However, these sensors often exhibit heterogeneous natures, resulting in distributional modality gaps that present significant challenges for fusion. To address this, a robust fusion technique is crucial, particularly for enhancing 3D object detection. In this paper, we introduce a dynamic adjustment…

    Submitted 21 July, 2024; originally announced July 2024.

  17. arXiv:2407.12022   

    cs.CL cs.AI

    ITERTL: An Iterative Framework for Fine-tuning LLMs for RTL Code Generation

    Authors: Peiyang Wu, Nan Guo, Xiao Xiao, Wenming Li, Xiaochun Ye, Dongrui Fan

    Abstract: Recently, large language models (LLMs) have demonstrated excellent performance in understanding human instructions and generating code, which has inspired researchers to explore the feasibility of generating RTL code with LLMs. However, the existing approaches to fine-tune LLMs on RTL codes typically are conducted on fixed datasets, which do not fully stimulate the capability of LLMs and require l…

    Submitted 23 July, 2024; v1 submitted 27 June, 2024; originally announced July 2024.

    Comments: There are some mistakes in the Experimental Setup in Section 4.1

  18. arXiv:2407.11790  [pdf, other]

    cs.LG cs.AI cs.AR cs.PF

    Characterizing and Understanding HGNN Training on GPUs

    Authors: Dengke Han, Mingyu Yan, Xiaochun Ye, Dongrui Fan

    Abstract: Owing to their remarkable representation capabilities for heterogeneous graph data, Heterogeneous Graph Neural Networks (HGNNs) have been widely adopted in many critical real-world domains such as recommendation systems and medical analysis. Prior to their practical application, identifying the optimal HGNN model parameters tailored to specific tasks through extensive training is a time-consuming…

    Submitted 15 August, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: 23 pages, 14 figures, submitted to ACM TACO

  19. arXiv:2407.10753  [pdf, other]

    cs.CV

    OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection

    Authors: Jinghua Hou, Tong Wang, Xiaoqing Ye, Zhe Liu, Shi Gong, Xiao Tan, Errui Ding, Jingdong Wang, Xiang Bai

    Abstract: Accurate depth information is crucial for enhancing the performance of multi-view 3D object detection. Despite the success of some existing multi-view 3D detectors utilizing pixel-wise depth supervision, they overlook two significant phenomena: 1) the depth supervision obtained from LiDAR points is usually distributed on the surface of the object, which is not so friendly to existing DETR-based 3D…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  20. arXiv:2407.10749  [pdf, other]

    cs.CV

    SEED: A Simple and Effective 3D DETR in Point Clouds

    Authors: Zhe Liu, Jinghua Hou, Xiaoqing Ye, Tong Wang, Jingdong Wang, Xiang Bai

    Abstract: Recently, detection transformers (DETRs) have gradually taken a dominant position in 2D detection thanks to their elegant framework. However, DETR-based detectors for 3D point clouds still struggle to achieve satisfactory performance. We argue that the main challenges are twofold: 1) How to obtain the appropriate object queries is challenging due to the high sparsity and uneven distribution o…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  21. arXiv:2407.10728  [pdf, ps, other]

    math.DS

    A counterexample on multiple convergence without commutativity

    Authors: Wen Huang, Song Shao, Xiangdong Ye

    Abstract: It is shown that there exist a probability space $(X,{\mathcal X},μ)$, two ergodic measure preserving transformations $T,S$ acting on $(X,{\mathcal X},μ)$ with $h_μ(X,T)=h_μ(X,S)=0$, and $f, g \in L^\infty(X,μ)$ such that the limit \begin{equation*} \lim_{N\to\infty}\frac{1}{N}\sum_{n=0}^{N-1} f(T^{n}x)g(S^{n}x) \end{equation*} does not exist in $L^2(X,μ)$.

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 10 pages. arXiv admin note: substantial text overlap with arXiv:2301.12409

  22. arXiv:2407.06546  [pdf, other]

    cs.CV cs.RO

    Exploring the Causality of End-to-End Autonomous Driving

    Authors: Jiankun Li, Hao Li, Jiangjiang Liu, Zhikang Zou, Xiaoqing Ye, Fan Wang, Jizhou Huang, Hua Wu, Haifeng Wang

    Abstract: Deep learning-based models are widely deployed in autonomous driving areas, especially the increasingly noticed end-to-end solutions. However, the black-box property of these models raises concerns about their trustworthiness and safety for autonomous driving, and how to debug the causality has become a pressing concern. Despite some existing research on the explainability of autonomous driving, t…

    Submitted 19 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  23. arXiv:2407.06249  [pdf, other]

    cs.CL cs.SE

    CodeUpdateArena: Benchmarking Knowledge Editing on API Updates

    Authors: Zeyu Leo Liu, Shrey Pandit, Xi Ye, Eunsol Choi, Greg Durrett

    Abstract: Large language models (LLMs) are increasingly being used to synthesize and reason about source code. However, the static nature of these models' knowledge does not reflect the fact that libraries and API functions they invoke are continuously evolving, with functionality being added or changing. While numerous benchmarks evaluate how LLMs can generate code, no prior work has studied how an LLMs' k…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Under Review

  24. arXiv:2407.05679  [pdf, other]

    cs.CV cs.AI

    BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space

    Authors: Yumeng Zhang, Shi Gong, Kaixin Xiong, Xiaoqing Ye, Xiao Tan, Fan Wang, Jizhou Huang, Hua Wu, Haifeng Wang

    Abstract: World models are receiving increasing attention in autonomous driving for their ability to predict potential future scenarios. In this paper, we present BEVWorld, a novel approach that tokenizes multimodal sensor inputs into a unified and compact Bird's Eye View (BEV) latent space for environment modeling. The world model consists of two parts: the multi-modal tokenizer and the latent BEV sequence…

    Submitted 18 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 10 pages

  25. arXiv:2407.02214  [pdf]

    physics.optics

    Enhanced Second-Harmonic Generation in Thin-Film Lithium Niobate Circular Bragg Nanocavity

    Authors: Zengya Li, Zhuoran Hu, Xiaona Ye, Zhengyang Mao, Juan Feng, Hao Li, Shijie Liu, Bo Wang, Yuanlin Zheng, Xianfeng Chen

    Abstract: Second-order nonlinearity gives rise to many distinctive physical phenomena, e.g., second-harmonic generation, which plays an important role in fundamental science and various applications. Lithium niobate, one of the most widely used nonlinear crystals, exhibits strong second-order nonlinear effects and electro-optic properties. However, its moderate refractive index and etching sidewall angle li…

    Submitted 11 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: 19 pages, 5 figures

  26. arXiv:2407.02015  [pdf, other]

    math.NA

    Robust First and Second-Order Differentiation for Regularized Optimal Transport

    Authors: Xingjie Li, Fei Lu, Molei Tao, Felix X. -F. Ye

    Abstract: Applications such as unbalanced and fully shuffled regression can be approached by optimizing regularized optimal transport (OT) distances, such as the entropic OT and Sinkhorn distances. A common approach for this optimization is to use a first-order optimizer, which requires the gradient of the OT distance. For faster convergence, one might also resort to a second-order optimizer, which addition…

    Submitted 2 July, 2024; originally announced July 2024.

    MSC Class: 68Q25; 68R10; 68U05

  27. arXiv:2407.01967  [pdf, other]

    cs.CV

    Unleash the Power of Local Representations for Few-Shot Classification

    Authors: Shi Tang, Guiming Luo, Xinchen Ye, Zhiyi Xia

    Abstract: Generalizing to novel classes unseen during training is a key challenge of few-shot classification. Recent metric-based methods try to address this by local representations. However, they are unable to take full advantage of them due to (i) improper supervision for pretraining the feature extractor, and (ii) lack of adaptability in the metric for handling various possible compositions of local fea…

    Submitted 2 July, 2024; originally announced July 2024.

  28. arXiv:2407.01016  [pdf, other]

    cs.CV

    SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection

    Authors: Dingkang Liang, Wei Hua, Chunsheng Shi, Zhikang Zou, Xiaoqing Ye, Xiang Bai

    Abstract: Semi-supervised object detection (SSOD), leveraging unlabeled data to boost object detectors, has become a hot topic recently. However, existing SSOD approaches mainly focus on horizontal objects, leaving multi-oriented objects common in aerial images unexplored. At the same time, the annotation cost of multi-oriented objects is significantly higher than that of their horizontal counterparts. Ther…

    Submitted 1 July, 2024; originally announced July 2024.

  29. arXiv:2406.18147  [pdf, ps, other]

    math.DS

    Correlation entropy of free semigroup actions

    Authors: Xiaojiang Ye, Yanjie Tang, Dongkui Ma

    Abstract: This paper introduces the concepts of correlation entropy and local correlation entropy for free semigroup actions on compact metric space, and explores their fundamental properties. Thereafter, we generalize some classical results on correlation entropy and local correlation entropy to apply to free semigroup actions. Finally, we establish the relationship between topological entropy, measure-the…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 35 pages

  30. arXiv:2406.16486  [pdf, other]

    cs.AI

    Towards Comprehensive Preference Data Collection for Reward Modeling

    Authors: Yulan Hu, Qingyang Li, Sheng Ouyang, Ge Chen, Kaihui Chen, Lijun Mei, Xucheng Ye, Fuzheng Zhang, Yong Liu

    Abstract: Reinforcement Learning from Human Feedback (RLHF) facilitates the alignment of large language models (LLMs) with human preferences, thereby enhancing the quality of responses generated. A critical component of RLHF is the reward model, which is trained on preference data and outputs a scalar reward during the inference stage. However, the collection of preference data still lacks thorough investig…

    Submitted 24 June, 2024; originally announced June 2024.

  31. arXiv:2406.15279  [pdf, other]

    cs.AI cs.CL

    Cross-Modality Safety Alignment

    Authors: Siyin Wang, Xingsong Ye, Qinyuan Cheng, Junwen Duan, Shimin Li, Jinlan Fu, Xipeng Qiu, Xuanjing Huang

    Abstract: As Artificial General Intelligence (AGI) becomes increasingly integrated into various facets of human life, ensuring the safety and ethical alignment of such systems is paramount. Previous studies primarily focus on single-modality threats, which may not suffice given the integrated and complex nature of cross-modality interactions. We introduce a novel safety alignment challenge called Safe Input…

    Submitted 21 June, 2024; originally announced June 2024.

  32. arXiv:2406.13270  [pdf, other]

    gr-qc hep-th

    Novel topological phenomena of timelike circular orbits for charged test particles

    Authors: Xu Ye, Shao-Wen Wei

    Abstract: The topological approach has recently been successfully employed to investigate timelike circular orbits for massive neutral test particles. The observed vanishing topological number implies that these timelike circular orbits occur in pairs. However, the behavior of charged test particles in this regard remains unexplored. To address this issue, our study focuses on examining the influence of par…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 30 pages, 29 figures

  33. arXiv:2406.13068  [pdf, other]

    gr-qc hep-th

    A new energy inequality in AdS

    Authors: Gary T. Horowitz, Diandian Wang, Xiaohua Ye

    Abstract: We study time symmetric initial data for asymptotically AdS spacetimes with conformal boundary containing a spatial circle. Such $d$-dimensional initial data sets can contain $(d-2)$-dimensional minimal surfaces if the circle is contractible. We compute the minimum energy of a large class of such initial data as a function of the area $A$ of this minimal surface. The statement $E \ge E_{min}(A)$ i…

    Submitted 15 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: 13 pages, 3 figs; v2: minor edits

  34. arXiv:2406.12275  [pdf, other]

    cs.CV

    VoCo-LLaMA: Towards Vision Compression with Large Language Models

    Authors: Xubing Ye, Yukang Gan, Xiaoke Huang, Yixiao Ge, Ying Shan, Yansong Tang

    Abstract: Vision-Language Models (VLMs) have achieved remarkable success in various multi-modal tasks, but they are often bottlenecked by the limited context window and high computational cost of processing high-resolution image inputs and videos. Vision compression can alleviate this problem by reducing the vision token count. Previous approaches compress vision tokens with external modules and force LLMs…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 18 pages, 5 figures

  35. arXiv:2406.08785  [pdf, other]

    cs.CV

    BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection

    Authors: Wenjie Wang, Yehao Lu, Guangcong Zheng, Shuigen Zhan, Xiaoqing Ye, Zichang Tan, Jingdong Wang, Gaoang Wang, Xi Li

    Abstract: Vision-based roadside 3D object detection has attracted rising attention in the autonomous driving domain, since it encompasses inherent advantages in reducing blind spots and expanding perception range. Previous work mainly focuses on accurately estimating depth or height for 2D-to-3D mapping, ignoring the position approximation error in the voxel pooling process. Inspired by this insight, we p…

    Submitted 12 June, 2024; originally announced June 2024.

  36. arXiv:2406.06062  [pdf, other]

    cs.CV cs.AI

    ProcessPainter: Learn Painting Process from Sequence Data

    Authors: Yiren Song, Shijie Huang, Chen Yao, Xiaojun Ye, Hai Ci, Jiaming Liu, Yuxuan Zhang, Mike Zheng Shou

    Abstract: The painting process of artists is inherently stepwise and varies significantly among different painters and styles. Generating detailed, step-by-step painting processes is essential for art education and research, yet remains largely underexplored. Traditional stroke-based rendering methods break down images into sequences of brushstrokes, yet they fall short of replicating the authentic processe…

    Submitted 20 July, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  37. arXiv:2406.03076  [pdf, ps, other]

    math.DG

    Fourier integral operators on Hardy spaces with Hormander class

    Authors: Xiaofeng Ye, Chunjie Zhang, Xiangrong Zhu

    Abstract: In this note, we consider a Fourier integral operator defined by \begin{align*} T_{φ,a}f(x) = \int_{\mathbb{R}^{n}}e^{iφ(x,ξ)}a(x,ξ)\widehat{f}(ξ)dξ, \end{align*} here $a$ is the amplitude, and $φ$ is the phase. Let $0\leq ρ\leq 1,n\geq 2$ or $0\leq ρ<1,n=1$ and $$m_p=\frac{ρ-n}{p}+(n-1)\min\{\frac 12,ρ\}.$$ If $a$ belongs to the forbidden Hörmander class $S^{m_p}_{ρ,1}$ and $φ\in Φ^{2}$ satisfies th…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 24 pages

    MSC Class: 42B20; 35S30

  38. arXiv:2406.02610  [pdf, other]

    q-bio.QM cs.AI cs.LG

    MoFormer: Multi-objective Antimicrobial Peptide Generation Based on Conditional Transformer Joint Multi-modal Fusion Descriptor

    Authors: Li Wang, Xiangzheng Fu, Jiahao Yang, Xinyi Zhang, Xiucai Ye, Yiping Liu, Tetsuya Sakurai, Xiangxiang Zeng

    Abstract: Deep learning holds great promise for optimizing existing peptides with more desirable properties, a critical step towards accelerating new drug discovery. Despite the recent emergence of several optimized antimicrobial peptide (AMP) generation methods, multi-objective optimizations remain quite challenging for the idealism-realism tradeoff. Here, we establish a multi-objective AMP synthesis…

    Submitted 3 June, 2024; originally announced June 2024.

  39. arXiv:2406.01563  [pdf, other]

    cs.CL

    LoFiT: Localized Fine-tuning on LLM Representations

    Authors: Fangcong Yin, Xi Ye, Greg Durrett

    Abstract: Recent work in interpretability shows that large language models (LLMs) can be adapted for new tasks in a learning-free way: it is possible to intervene on LLM representations to elicit desired behaviors for alignment. For instance, adding certain bias vectors to the outputs of certain attention heads is reported to boost the truthfulness of models. In this work, we show that localized fine-tuning…

    Submitted 3 June, 2024; originally announced June 2024.

  40. arXiv:2406.00988  [pdf, other]

    cs.AR

    ADE-HGNN: Accelerating HGNNs through Attention Disparity Exploitation

    Authors: Dengke Han, Meng Wu, Runzhen Xue, Mingyu Yan, Xiaochun Ye, Dongrui Fan

    Abstract: Heterogeneous Graph Neural Networks (HGNNs) have recently demonstrated great power in handling heterogeneous graph data, rendering them widely applied in many critical real-world domains. Most HGNN models leverage attention mechanisms to significantly improve model accuracy, albeit at the cost of increased computational complexity and memory bandwidth requirements. Fortunately, the attention dispar…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 15 pages, 9 figures, accepted by Euro-PAR 2024

  41. arXiv:2405.20714  [pdf]

    cond-mat.mtrl-sci cond-mat.str-el

    Large low-field magnetocaloric response in a ferromagnetic gadolinium orthophosphate

    Authors: Ziyu W. Yang, Jie Zhang, Maocai Pi, Xubin Ye, Chenxu Kang, Xiaoliang Weng, Wei Tang, Hongzhi Cui, Yu-Jia Zeng, Youwen Long

    Abstract: Bulk magnetic and thermodynamic measurements, along with mean-field calculations, were conducted on the ferromagnetic K3Gd5(PO4)6 powders. No magnetic ordering was observed until 2 K, while the application of an external field B > 1 T resulted in the splitting of the Gd3+ ground state multiplet and induced a non-cooperative Schottky effect. The average nearest-neighbor exchange strength |J1/kB| is…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures

  42. arXiv:2405.20535  [pdf, other]

    cs.AI cs.CL

    Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning

    Authors: Xinlu Zhang, Zhiyu Zoey Chen, Xi Ye, Xianjun Yang, Lichang Chen, William Yang Wang, Linda Ruth Petzold

    Abstract: Instruction Fine-Tuning (IFT) significantly enhances the zero-shot capabilities of pretrained Large Language Models (LLMs). While coding data is known to boost reasoning abilities during LLM pretraining, its role in activating internal reasoning capacities during IFT remains understudied. This paper investigates a key question: How does coding data impact LLMs' reasoning capacities during the IFT…

    Submitted 30 May, 2024; originally announced May 2024.

  43. arXiv:2405.17329  [pdf, other]

    cs.IT eess.SP

    Joint MIMO Transceiver and Reflector Design for Reconfigurable Intelligent Surface-Assisted Communication

    Authors: Yaqiong Zhao, Jindan Xu, Wei Xu, Kezhi Wang, Xinquan Ye, Chau Yuen, Xiaohu You

    Abstract: In this paper, we consider a reconfigurable intelligent surface (RIS)-assisted multiple-input multiple-output communication system with multiple antennas at both the base station (BS) and the user. We plan to maximize the achievable rate through jointly optimizing the transmit precoding matrix, the receive combining matrix, and the RIS reflection matrix under the constraints of the transmit power…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 14 pages, 12 figures

  44. arXiv:2405.16923  [pdf, other]

    cs.CV

    SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain

    Authors: Butian Xiong, Xiaoyu Ye, Tze Ho Elden Tse, Kai Han, Shuguang Cui, Zhen Li

    Abstract: With the emergence of Gaussian Splats, recent efforts have focused on large-scale scene geometric reconstruction. However, most of these efforts either concentrate on memory reduction or spatial space division, neglecting information in the semantic space. In this paper, we propose a novel method, named SA-GS, for fine-grained 3D geometry reconstruction using semantic-aware 3D Gaussian Splats. Spe…

    Submitted 28 May, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Might need more comparison, will be added later

  45. Guaranteeing Accuracy and Fairness under Fluctuating User Traffic: A Bankruptcy-Inspired Re-ranking Approach

    Authors: Xiaopeng Ye, Chen Xu, Jun Xu, Xuyang Xie, Gang Wang, Zhenhua Dong

    Abstract: Out of sustainable and economical considerations, two-sided recommendation platforms must satisfy the needs of both users and providers. Previous studies often show that the two sides' needs show different urgency: providers need a relatively long-term exposure demand while users want more short-term and accurate service. However, our empirical study reveals that previous methods for trading off f…

    Submitted 16 August, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  46. arXiv:2405.13124  [pdf, other]

    astro-ph.GA astro-ph.SR

    The Pristine survey -- XXVI. The very metal-poor Galaxy: Chemodynamics through the follow-up of the Pristine-Gaia synthetic catalogue

    Authors: Akshara Viswanathan, Zhen Yuan, Anke Ardern-Arentsen, Else Starkenburg, Nicolas F. Martin, Kris Youakim, Rodrigo A. Ibata, Federico Sestito, Tadafumi Matsuno, Carlos Allende Prieto, Freya Barwell, Manuel Bayer, Amandine Doliva-Dolinsky, Emma Fernandez-Alvar, Pablo M. Galan-de Anta, Kiran Jhass, Nicolas Longeard, Jose Maria Arroyo-Polonio, Pol Massana, Martin Montelius, Samuel Rusterucci, Judith Santos, Guillaume F. Thomas, Sara Vitali, Wenbo Wu, et al. (5 additional authors not shown)

    Abstract: The Pristine-\textit{Gaia} synthetic catalogue provides reliable photometric metallicities for $\sim$30 million FGK stars using the Pristine survey model and Gaia XP spectra. We perform the first low-to-medium-resolution spectroscopic follow-up of bright (G<15) and distant (up to 35 kpc) very and extremely metal-poor (V/EMP, [Fe/H]<-2.5) red giant branch stars from this. We use Isaac Newton Telesc…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Submitted to A&A. 17 pages (9 figures) + 3 pages (3 figures) in Appendix. Comments are very welcome! The catalogue and 1D spectra will be made publicly available after acceptance, and before then upon reasonable request to the first author

  47. arXiv:2405.12218  [pdf, other]

    cs.CV

    MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo

    Authors: Tianqi Liu, Guangcong Wang, Shoukang Hu, Liao Shen, Xinyi Ye, Yuhang Zang, Zhiguo Cao, Wei Li, Ziwei Liu

    Abstract: We present MVSGaussian, a new generalizable 3D Gaussian representation approach derived from Multi-View Stereo (MVS) that can efficiently reconstruct unseen scenes. Specifically, 1) we leverage MVS to encode geometry-aware Gaussian representations and decode them into Gaussian parameters. 2) To further enhance performance, we propose a hybrid Gaussian rendering that integrates an efficient volume…

    Submitted 15 July, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: ECCV2024, Project page: https://mvsgaussian.github.io/ , Code: https://github.com/TQTQliu/MVSGaussian

  48. arXiv:2405.11251  [pdf, ps, other]

    math.DS

    A refined saturation theorem for polynomials and applications

    Authors: Xiangdong Ye, Jiaqi Yu

    Abstract: For a dynamical system $(X,T)$, $d\in\mathbb{N}$ and distinct non-constant integral polynomials $p_1,\ldots, p_d$ vanishing at $0$, the notion of regionally proximal relation along $C=\{p_1,\ldots,p_d\}$ (denoted by $RP_C^{[d]}(X,T)$) is introduced. It turns out that for a minimal system, $RP_C^{[d]}(X,T)=Δ$ implies that $X$ is an almost one-to-one extension of $X_k$ for some $k\in\mathbb{N}$ on…

    Submitted 18 May, 2024; originally announced May 2024.

  49. arXiv:2405.06871  [pdf, other]

    math.NA math.PR

    Statistical Error of Numerical Integrators for Underdamped Langevin Dynamics with Deterministic And Stochastic Gradients

    Authors: Xuda Ye, Zhennan Zhou

    Abstract: We propose a novel discrete Poisson equation approach to estimate the statistical error of a broad class of numerical integrators for the underdamped Langevin dynamics. The statistical error refers to the mean square error of the estimator to the exact ensemble average with a finite number of iterations. With the proposed error analysis framework, we show that when the potential function $U(x)$ is…

    Submitted 10 May, 2024; originally announced May 2024.

    MSC Class: 60H35; 37M05

  50. arXiv:2405.06247  [pdf, other]

    cs.LG cs.AI cs.CR

    Disttack: Graph Adversarial Attacks Toward Distributed GNN Training

    Authors: Yuxiang Zhang, Xin Liu, Meng Wu, Wei Yan, Mingyu Yan, Xiaochun Ye, Dongrui Fan

    Abstract: Graph Neural Networks (GNNs) have emerged as potent models for graph learning. Distributing the training process across multiple computing nodes is the most promising solution to address the challenges of ever-growing real-world graphs. However, current adversarial attack methods on GNNs neglect the characteristics and applications of the distributed scenario, leading to suboptimal performance and…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted by 30th International European Conference on Parallel and Distributed Computing (Euro-Par 2024)