-
Safety-critical Locomotion of Biped Robots in Infeasible Paths: Overcoming Obstacles during Navigation toward Destination
Authors:
Jaemin Lee,
Min Dai,
Jeeseop Kim,
Aaron D. Ames
Abstract:
This paper proposes a safety-critical locomotion control framework for legged robots traversing infeasible paths in obstacle-rich environments. Our research focuses on achieving safe and robust locomotion when robots confront unavoidable obstacles en route to their designated destination. Using the outcomes of physical interactions with unknown objects, we establish a hierarchy among the safety-critical conditions for avoiding the obstacles. This hierarchy enables the generation of a safe reference trajectory that mitigates conflicts among safety conditions and reduces risk while steering the robot toward its destination without additional motion planning methods. In addition, robust bipedal locomotion is achieved by using the Hybrid Linear Inverted Pendulum model, coupled with a disturbance observer that addresses disturbances from the physical interaction.
Submitted 16 September, 2024;
originally announced September 2024.
-
Chalcogenide Metasurfaces Enabling Ultra-Wideband Detectors from Visible to Mid-infrared
Authors:
Shutao Zhang,
Shu An,
Mingjin Dai,
Qing Yang Steve Wu,
Nur Qalishah Adanan,
Jun Zhang,
Yan Liu,
Henry Yit Loong Lee,
Nancy Lai Mun Wong,
Ady Suwardi,
Jun Ding,
Robert Edward Simpson,
Qi Jie Wang,
Joel K. W. Yang,
Zhaogang Dong
Abstract:
Thermoelectric materials can be designed to support optical resonances across multiple spectral ranges to enable ultra-wideband photodetection. For instance, antimony telluride (Sb2Te3) chalcogenide exhibits interband plasmonic resonances in the visible range and Mie resonances in the mid-infrared (mid-IR) range, while simultaneously possessing large thermoelectric Seebeck coefficients. In this paper, we design and fabricate Sb2Te3 metasurface devices that achieve resonant absorption, enabling photodetectors that operate across an ultra-wideband spectrum from the visible to the mid-IR. Furthermore, relying on an asymmetric Sb2Te3 metasurface, we demonstrate thermoelectric photodetectors with polarization selectivity. This work provides a potential platform for portable ultra-wideband spectrometers operating at room temperature for environmental sensing applications.
Submitted 7 September, 2024;
originally announced September 2024.
-
Learning to Optimally Stop Diffusion Processes, with Financial Applications
Authors:
Min Dai,
Yu Sun,
Zuo Quan Xu,
Xun Yu Zhou
Abstract:
We study optimal stopping for diffusion processes with unknown model primitives within the continuous-time reinforcement learning (RL) framework developed by Wang et al. (2020), and present applications to option pricing and portfolio choice. By penalizing the corresponding variational inequality formulation, we transform the stopping problem into a stochastic optimal control problem with two actions. We then randomize controls into Bernoulli distributions and add an entropy regularizer to encourage exploration. We derive a semi-analytical optimal Bernoulli distribution, based on which we devise RL algorithms using the martingale approach established in Jia and Zhou (2022a), and prove a policy improvement theorem. We demonstrate the effectiveness of the algorithms in pricing finite-horizon American put options and in solving Merton's problem with transaction costs, and show that both the offline and online algorithms achieve high accuracy in learning the value functions and characterizing the associated free boundaries.
Submitted 8 September, 2024; v1 submitted 17 August, 2024;
originally announced August 2024.
-
Dynamical Dark Energy from Lattice Quantum Gravity
Authors:
Mingwei Dai,
Walter Freeman,
Jack Laiho,
Marc Schiffer,
Judah Unmuth-Yockey
Abstract:
We study the behavior of the vacuum in Euclidean dynamical triangulations (EDT). Algorithmic improvements and better lattice spacing determinations allow us to test the properties of the emergent de Sitter geometries of our simulations to higher precision than previously possible. Although the agreement with de Sitter is good, the improved precision reveals deviations that can be interpreted as non-trivial vacuum dynamics, well-described by a cosmological constant that runs with scale. The simulations show that the dominant running is quadratic and that the scale can be identified with the Hubble rate. Several key cross-checks support this picture, including consistent results across multiple lattice spacings and the fact that covariant energy conservation is maintained. The parameters of the running are fully determined by simulations, enabling predictions when extrapolated to the scales relevant for our universe. This leads to a model for dark energy that is compatible with current observations, but which predicts deviations from $Λ$CDM at the ${\cal O}(10^{-3})$ level in cosmological observables that could be tested with future improvements in precision measurements.
Submitted 16 August, 2024;
originally announced August 2024.
-
A Survey on Benchmarks of Multimodal Large Language Models
Authors:
Jian Li,
Weiheng Lu,
Hao Fei,
Meng Luo,
Ming Dai,
Min Xia,
Yizhang Jin,
Zhenye Gan,
Ding Qi,
Chaoyou Fu,
Ying Tai,
Wankou Yang,
Yabiao Wang,
Chengjie Wang
Abstract:
Multimodal Large Language Models (MLLMs) are gaining increasing popularity in both academia and industry due to their remarkable performance in various applications such as visual question answering, visual perception, understanding, and reasoning. Over the past few years, significant efforts have been made to examine MLLMs from multiple perspectives. This paper presents a comprehensive review of 200 benchmarks and evaluations for MLLMs, focusing on (1) perception and understanding, (2) cognition and reasoning, (3) specific domains, (4) key capabilities, and (5) other modalities. Finally, we discuss the limitations of the current evaluation methods for MLLMs and explore promising future directions. Our key argument is that evaluation should be regarded as a crucial discipline to better support the development of MLLMs. For more details, please visit our GitHub repository: https://github.com/swordlidev/Evaluation-Multimodal-LLMs-Survey.
Submitted 6 September, 2024; v1 submitted 16 August, 2024;
originally announced August 2024.
-
Deep State-Space Generative Model For Correlated Time-to-Event Predictions
Authors:
Yuan Xue,
Denny Zhou,
Nan Du,
Andrew M. Dai,
Zhen Xu,
Kun Zhang,
Claire Cui
Abstract:
Capturing the inter-dependencies among multiple types of clinically-critical events is critical not only to accurate future event prediction, but also to better treatment planning. In this work, we propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events (e.g., kidney failure, mortality) by explicitly modeling the temporal dynamics of patients' latent states. Based on these learned patient states, we further develop a new general discrete-time formulation of the hazard rate function to estimate the survival distribution of patients with significantly improved accuracy. Extensive evaluations over real EMR data show that our proposed model compares favorably to various state-of-the-art baselines. Furthermore, our method also uncovers meaningful insights about the latent correlations among mortality and different types of organ failures.
Submitted 27 July, 2024;
originally announced July 2024.
-
Learning to Select the Best Forecasting Tasks for Clinical Outcome Prediction
Authors:
Yuan Xue,
Nan Du,
Anne Mottram,
Martin Seneviratne,
Andrew M. Dai
Abstract:
We propose to meta-learn a self-supervised patient trajectory forecast learning rule by meta-training on a meta-objective that directly optimizes the utility of the patient representation over the subsequent clinical outcome prediction. This meta-objective directly targets the usefulness of a representation generated from unlabeled clinical measurement forecast for later supervised tasks.
The meta-learned can then be directly used in target risk prediction, and the limited available samples can be used for further fine-tuning the model performance. The effectiveness of our approach is tested on a real open source patient EHR dataset MIMIC-III. We are able to demonstrate that our attention-based patient state representation approach can achieve much better performance for predicting target risk with low resources comparing with both direct supervised learning and pretraining with all-observation trajectory forecast.
Submitted 27 July, 2024;
originally announced July 2024.
-
Beale--Kato--Majda-type continuation criteria for Hall- and electron-magnetohydrodynamics
Authors:
Mimi Dai,
Sung-Jin Oh
Abstract:
We show that regular solutions to electron-MHD with resistivity can be continued as long as the time integral of the supremum of the current gradient remains finite. This dimensionless continuation criterion is analogous to the celebrated result of Beale--Kato--Majda for the incompressible Euler and Navier--Stokes equations. A similar continuation criterion, formulated in terms of the time integral of the supremum of the vorticity, velocity gradient and current gradient, is established for the Hall-MHD with resistivity as well.
Submitted 12 September, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
-
An Onsager-type theorem for SQG
Authors:
Mimi Dai,
Vikram Giri,
Razvan-Octavian Radu
Abstract:
We construct non-trivial weak solutions $θ\in C_t^0C_x^{0-}$ to the surface quasi-geostrophic (SQG) equations, which have compact support in time and, thus, violate the conservation of the Hamiltonian. The result is sharp in view of the fact that such a conservation law holds for all weak solutions in the class $C_{t,x}^0 \subset L_{t,x}^3$ (Isett-Vicol, 2015) and resolves the Onsager conjecture for SQG. The construction is achieved by means of a Nash iteration together with the linear decoupling method recently introduced in Giri-Radu (2023).
Submitted 2 July, 2024;
originally announced July 2024.
-
Mode-Locked Fiber Laser with up to 19 kHz Wavelength Sweep Rate via External Pump LD Modulation
Authors:
Guanyu Ye,
Maolin Dai,
Bowen Liu,
Yifan Ma,
Takuma Shirahata,
Shinji Yamashita,
Sze Yun Set
Abstract:
For the first time, we introduce a rapid wavelength-swept, passively mode-locked fiber laser in an all-polarization-maintaining and all-fiber configuration. Achieving an exceptional wavelength sweep rate of up to 19 kHz through external modulation of the LD driver pump current, this laser offers a high sweep rate, simple cavity design, cost-effectiveness, and excellent repeatability.
Submitted 18 May, 2024;
originally announced June 2024.
-
User Association and Channel Allocation in 5G Mobile Asymmetric Multi-band Heterogeneous Networks
Authors:
Miao Dai,
Gang Sun,
Hongfang Yu,
Sheng Wang,
Dusit Niyato
Abstract:
With the proliferation of mobile terminals and the continuous upgrading of services, 4G LTE networks are showing signs of weakness. To enhance the capacity of wireless networks, millimeter waves are introduced to drive the evolution of networks towards multi-band 5G heterogeneous networks. The distinct propagation characteristics of mmWaves and microwaves, as well as the vastly different hardware configurations of heterogeneous base stations, make traditional access strategies no longer effective. Therefore, to narrow the gap between theory and practice, we investigate the access strategy in multi-band 5G heterogeneous networks, taking into account the characteristics of mobile users, asynchronous switching between uplink and downlink of pico base stations, asymmetric service requirements, and user communication continuity. We formulate the problem as an integer nonlinear program and prove its intractability. We then decouple it into three subproblems: user association, switch point selection, and subchannel allocation, and design an algorithm based on optimal matching and spectral clustering to solve it efficiently. The simulation results show that the proposed algorithm outperforms the comparison methods in terms of overall data rate, effective data rate, and number of satisfied users.
Submitted 29 May, 2024;
originally announced May 2024.
-
Non-unique solutions for electron MHD
Authors:
Mimi Dai
Abstract:
We consider the electron magnetohydrodynamics (MHD) equation on the 3D torus $\mathbb T^3$. For a given smooth vector field $H$ with zero mean and zero divergence, we can construct a weak solution $B$ to the electron MHD in the space $L^γ_tW^{1,p}_x$ for appropriate $(γ, p)$ such that $B$ is arbitrarily close to $H$ in this space. The parameters $γ$ and $p$ depend on the resistivity. As a consequence, non-uniqueness of weak solutions is obtained for the electron MHD with hyper-resistivity. In particular, non-Leray-Hopf solutions can be constructed. As a byproduct, we also show the existence of weak solutions to the electron MHD without resistivity.
Submitted 22 May, 2024;
originally announced May 2024.
-
Self-similar singularities for electron MHD
Authors:
Mimi Dai,
Hannah Guerra,
Chao Wu
Abstract:
We study several types of self-similar solutions for the electron magnetohydrodynamics (MHD) without resistivity, including locally self-similar solutions and pseudo-self-similar solutions. We show that under certain conditions, these types of self-similar blowup solutions can be excluded.
Submitted 1 May, 2024;
originally announced May 2024.
-
Rapid-scanned and self-corrected repetition rates enabled in a bidirectional polarization-multiplexed fiber laser
Authors:
Bowen Liu,
Maolin Dai,
Takuma Shirahata,
Yifan Ma,
Shinji Yamashita,
Sze Yun Set
Abstract:
Repetition-rate-scanned lasers are practical for accordion frequency comb generation, which serves as a variable gearbox connecting the optical and radio-wave domains. A rapidly and widely scanned repetition rate can serve versatile purposes; however, scanning robustness remains unsecured and typically requires complicated feedback loops. Recently, multiplexed lasers have been demonstrated with inherent common-noise rejection among simultaneously emitted combs. Here, we propose a bidirectional polarization-multiplexed fiber laser that delivers synchronized pulses with rapid-scanned and reference-free repetition rates. Benefiting from the all-polarization-maintaining fiber configuration, the laser shows good robustness and inter-comb coherence. A scanning rate as high as 493.5 kHz/s over a 329-kHz scanning range of the fundamental repetition rate is realized. The 1-hour and 1-day maximal variations of the difference frequency are merely 0.52 Hz and 5.46 Hz, respectively. The capability to rebuild the steady state after mode hopping is also demonstrated. These results provide a promising solution for developing high-performance accordion-frequency laser sources.
Submitted 24 April, 2024;
originally announced April 2024.
-
Tailoring Generative Adversarial Networks for Smooth Airfoil Design
Authors:
Joyjit Chattoraj,
Jian Cheng Wong,
Zhang Zexuan,
Manna Dai,
Xia Yingzhi,
Li Jichao,
Xu Xinxing,
Ooi Chin Chun,
Yang Feng,
Dao My Ha,
Liu Yong
Abstract:
In the realm of aerospace design, achieving smooth curves is paramount, particularly when crafting objects such as airfoils. Generative Adversarial Network (GAN), a widely employed generative AI technique, has proven instrumental in synthesizing airfoil designs. However, a common limitation of GAN is the inherent lack of smoothness in the generated airfoil surfaces. To address this issue, we present a GAN model featuring a customized loss function built to produce seamlessly contoured airfoil designs. Additionally, our model demonstrates a substantial increase in design diversity compared to a conventional GAN augmented with a post-processing smoothing filter.
Submitted 17 April, 2024;
originally announced April 2024.
-
783-MHz fundamental repetition rate all-fiber ring laser mode-locked by carbon nanotubes
Authors:
Maolin Dai,
Bowen Liu,
Yifan Ma,
Takuma Shirahata,
Ruoao Yang,
Zhigang Zhang,
Sze Yun Set,
Shinji Yamashita
Abstract:
We demonstrate a 783-MHz fundamental repetition rate mode-locked Er-doped all-fiber ring laser with a pulse width of 623 fs. By using a carbon nanotube (CNT) saturable absorber (SA), a relatively low self-starting pump threshold of 108 mW is achieved. The laser has a very compact footprint of less than 10 cm × 10 cm, benefiting from the all-active-fiber cavity design. The robust mode-locking is confirmed by the low relative intensity noise (RIN) and a long-term stability test. We propose a new scheme for generating high repetition rate femtosecond optical pulses from a compact and stable all-active-fiber ring oscillator.
Submitted 28 May, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
Best Practices and Lessons Learned on Synthetic Data
Authors:
Ruibo Liu,
Jerry Wei,
Fangyu Liu,
Chenglei Si,
Yanzhe Zhang,
Jinmeng Rao,
Steven Zheng,
Daiyi Peng,
Diyi Yang,
Denny Zhou,
Andrew M. Dai
Abstract:
The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs. Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns. This paper provides an overview of synthetic data research, discussing its applications, challenges, and future directions. We present empirical evidence from prior art to demonstrate its effectiveness and highlight the importance of ensuring its factuality, fidelity, and unbiasedness. We emphasize the need for responsible use of synthetic data to build more powerful, inclusive, and trustworthy language models.
Submitted 10 August, 2024; v1 submitted 11 April, 2024;
originally announced April 2024.
-
Allo: A Programming Model for Composable Accelerator Design
Authors:
Hongzheng Chen,
Niansong Zhang,
Shaojie Xiang,
Zhichen Zeng,
Mengjia Dai,
Zhiru Zhang
Abstract:
Special-purpose hardware accelerators are increasingly pivotal for sustaining performance improvements in emerging applications, especially as the benefits of technology scaling continue to diminish. However, designers currently lack effective tools and methodologies to construct complex, high-performance accelerator architectures in a productive manner. Existing high-level synthesis (HLS) tools often require intrusive source-level changes to attain satisfactory quality of results. Despite the introduction of several new accelerator design languages (ADLs) aiming to enhance or replace HLS, their advantages are more evident in relatively simple applications with a single kernel. Existing ADLs prove less effective for realistic hierarchical designs with multiple kernels, even if the design hierarchy is flattened.
In this paper, we introduce Allo, a composable programming model for efficient spatial accelerator design. Allo decouples hardware customizations, including compute, memory, communication, and data type, from algorithm specification, and encapsulates them as a set of customization primitives. Allo preserves the hierarchical structure of an input program by combining customizations from different functions in a bottom-up, type-safe manner. This approach facilitates holistic optimizations that span function boundaries. We conduct comprehensive experiments on commonly used HLS benchmarks and several realistic deep learning models. Our evaluation shows that Allo can outperform state-of-the-art HLS tools and ADLs on all test cases in PolyBench. For the GPT2 model, the inference latency of the Allo-generated accelerator is 1.7x faster than the NVIDIA A100 GPU with 5.4x higher energy efficiency, demonstrating the capability of Allo to handle large-scale designs.
Submitted 7 April, 2024;
originally announced April 2024.
-
Non-unique stationary solutions of even active scalar equations
Authors:
Mimi Dai,
Chao Wu
Abstract:
We study a class of active scalar equations with even non-local operator in the drift term. Non-trivial stationary weak solutions in the space $C^{0-}$ are constructed using the iterative convex integration approach.
Submitted 25 March, 2024;
originally announced March 2024.
-
OS-FPI: A Coarse-to-Fine One-Stream Network for UAV Geo-Localization
Authors:
Jiahao Chen,
Enhui Zheng,
Ming Dai,
Yifu Chen,
Yusheng Lu
Abstract:
The geo-localization and navigation technology of unmanned aerial vehicles (UAVs) in denied environments is currently a prominent research area. Prior approaches mainly employed a two-stream network with non-shared weights to extract features from UAV and satellite images separately, followed by related modeling to obtain the response map. However, the two-stream network extracts UAV and satellite features independently. This approach significantly affects the efficiency of feature extraction and increases the computational load. To address these issues, we propose a novel coarse-to-fine one-stream network (OS-FPI). Our approach allows information exchange between UAV and satellite features during early image feature extraction. To improve the model's performance, the framework retains feature maps generated at different stages of the feature extraction process for the feature fusion network, and establishes additional connections between UAV and satellite feature maps in the feature fusion network. Additionally, the framework introduces offset prediction to further refine and optimize the model's prediction results based on the classification tasks. Our proposed model has an inference speed similar to that of FPI while significantly reducing the number of parameters. It can achieve better performance with fewer parameters under the same conditions. Moreover, it achieves state-of-the-art performance on the UL14 dataset. Compared to previous models, our model achieves a 10.92-point improvement on the RDS metric, reaching 76.25. Furthermore, its meter-level localization accuracy improves substantially, with a 182.62% improvement in 3-meter accuracy, a 164.17% improvement in 5-meter accuracy, and a 137.43% improvement in 10-meter accuracy.
Submitted 10 March, 2024;
originally announced March 2024.
-
Two mass-imbalanced atoms in a hard-wall trap: Deep learning integrability of many-body systems
Authors:
Liheng Lang,
Qichen Lu,
C. M. Dai,
Xingbo Wei,
Yanxia Liu,
Yunbo Zhang
Abstract:
The study of integrable systems has led to significant advancements in our understanding of many-body physics. We design a series of numerical experiments to analyze the integrability of a mass-imbalanced two-body system through energy level statistics and deep learning of wavefunctions. The level spacing distributions are fitted by a Brody distribution and the fitting parameter $ω$ is found to separate the integrable and non-integrable mass ratios by a critical line $ω=0$. The convolutional neural network built from the probability density images could identify the transition points between integrable and non-integrable systems with high accuracy, yet in a much shorter computation time. A brilliant example of the network's ability is to identify a new integrable mass ratio $1/3$ by learning from the known integrable case of equal mass, with a remarkable network confidence of $98.78\%$. The robustness of our neural networks is further enhanced by adversarial learning, where samples are generated by standard and quantum perturbations mixed in the probability density images and the wavefunctions, respectively.
Submitted 25 February, 2024;
originally announced February 2024.
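For readers unfamiliar with the Brody distribution mentioned in the abstract above, the following is a minimal sketch of its standard textbook form for unit mean spacing, $P(s) = (ω+1)\,b\,s^{ω} e^{-b s^{ω+1}}$ with $b = Γ\big(\tfrac{ω+2}{ω+1}\big)^{ω+1}$. The function name `brody_pdf` is hypothetical, and the listing does not state the authors' exact fitting conventions, so this should be read as an illustration only.

```python
import math


def brody_pdf(s: float, omega: float) -> float:
    """Standard Brody level-spacing density P(s), normalised to unit mean spacing.

    Assumption: textbook parametrisation, not necessarily the authors' convention.
    """
    # Normalisation constant b = Gamma((omega + 2) / (omega + 1)) ** (omega + 1)
    b = math.gamma((omega + 2.0) / (omega + 1.0)) ** (omega + 1.0)
    return (omega + 1.0) * b * s ** omega * math.exp(-b * s ** (omega + 1.0))


# omega = 0 reduces to the Poisson form exp(-s);
# omega = 1 reduces to the Wigner surmise (pi / 2) * s * exp(-pi * s**2 / 4).
poisson_like = brody_pdf(1.0, 0.0)
wigner_like = brody_pdf(1.0, 1.0)
```

In this parametrisation, $ω=0$ corresponds to Poisson statistics and $ω=1$ to the Wigner surmise, which is why the fitted $ω$ can serve as an indicator of where a spectrum sits between the two limits.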
-
Pump-power-controlled L-band wavelength-tunable mode-locked fiber laser utilizing all polarization maintaining nonlinear polarization rotation
Authors:
Guanyu Ye,
Bowen Liu,
Maolin Dai,
Yifan Ma,
Takuma Shirahata,
Shinji Yamashita,
Sze Yun Set
Abstract:
For the first time, we present the pump power-controlled wavelength-tunable mode-locked fiber laser in the L-band (1565 nm to 1625 nm), achieved by all-polarization maintaining (all-PM) nonlinear polarization rotation (NPR). The wavelength of the laser can be tuned over 20 nm, from 1568.2 nm to 1588.9 nm simply by controlling the pump power from 45 mW to 115 mW. In contrast to conventional wavelength tuning mechanisms such as optical bandpass filters, our tuning method is non-mechanical and electrically controllable, featuring simplicity and cost-effectiveness in a superior all-fiber design.
Submitted 6 February, 2024;
originally announced February 2024.
-
GeoDecoder: Empowering Multimodal Map Understanding
Authors:
Feng Qi,
Mian Dai,
Zixian Zheng,
Chao Wang
Abstract:
This paper presents GeoDecoder, a dedicated multimodal model designed for processing geospatial information in maps. Built on the BeitGPT architecture, GeoDecoder incorporates specialized expert modules for image and text processing. On the image side, GeoDecoder utilizes GaoDe Amap as the underlying base map, which inherently encompasses essential details about road and building shapes, relative positions, and other attributes. Through the utilization of rendering techniques, the model seamlessly integrates external data and features such as symbol markers, drive trajectories, heatmaps, and user-defined markers, eliminating the need for extra feature engineering. The text module of GeoDecoder accepts various context texts and question prompts, generating text outputs in the style of GPT. Furthermore, the GPT-based model allows for the training and execution of multiple tasks within the same model in an end-to-end manner. To enhance map cognition and enable GeoDecoder to acquire knowledge about the distribution of geographic entities in Beijing, we devised eight fundamental geospatial tasks and conducted pretraining of the model using large-scale text-image samples. Subsequently, rapid fine-tuning was performed on three downstream tasks, resulting in significant performance improvements. The GeoDecoder model demonstrates a comprehensive understanding of map elements and their associated operations, enabling efficient and high-quality application of diverse geospatial tasks in different business scenarios.
Submitted 18 February, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
Data-driven Option Pricing
Authors:
Min Dai,
Hanqing Jin,
Xi Yang
Abstract:
We propose an innovative data-driven option pricing methodology that relies exclusively on the dataset of historical underlying asset prices. While the dataset is rooted in the objective world, option prices are commonly expressed as discounted expectations of their terminal payoffs in a risk-neutral world. Bridging this gap motivates us to identify a pricing kernel process, transforming option pricing into evaluating expectations in the objective world. We recover the pricing kernel by solving a utility maximization problem, and evaluate the expectations in terms of a functional optimization problem. Leveraging the deep learning technique, we design data-driven algorithms to solve both optimization problems over the dataset. Numerical experiments are presented to demonstrate the efficiency of our methodology.
Submitted 20 January, 2024;
originally announced January 2024.
-
A Hybrid Quantum Computing Pipeline for Real World Drug Discovery
Authors:
Weitang Li,
Zhi Yin,
Xiaoran Li,
Dongqiang Ma,
Shuang Yi,
Zhenxing Zhang,
Chenji Zou,
Kunliang Bu,
Maochun Dai,
Jie Yue,
Yuzong Chen,
Xiaojin Zhang,
Shengyu Zhang
Abstract:
Quantum computing, with its superior computational capabilities compared to classical approaches, holds the potential to revolutionize numerous scientific domains, including pharmaceuticals. However, the application of quantum computing for drug discovery has primarily been limited to proof-of-concept studies, which often fail to capture the intricacies of real-world drug development challenges. In this study, we diverge from conventional investigations by developing a hybrid quantum computing pipeline tailored to address genuine drug design problems. Our approach underscores the application of quantum computation in drug discovery and propels it towards a more scalable system. We specifically construct our versatile quantum computing pipeline to address two critical tasks in drug discovery: the precise determination of Gibbs free energy profiles for prodrug activation involving covalent bond cleavage, and the accurate simulation of covalent bond interactions. This work serves as a pioneering effort in benchmarking quantum computing against veritable scenarios encountered in drug design, especially the covalent bonding issue present in both of the case studies, thereby transitioning from theoretical models to tangible applications. Our results demonstrate the potential of a quantum computing pipeline for integration into real-world drug design workflows.
Submitted 24 July, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
Tantalum airbridges for scalable superconducting quantum processors
Authors:
Kunliang Bu,
Sainan Huai,
Zhenxing Zhang,
Dengfeng Li,
Yuan Li,
Jingjing Hu,
Xiaopei Yang,
Maochun Dai,
Tianqi Cai,
Yi-Cong Zheng,
Shengyu Zhang
Abstract:
The unique properties of tantalum (Ta), particularly its long coherent lifetime in superconducting qubits and its exceptional resistance to both acid and alkali, make it promising for superconducting quantum processors. It is a notable advantage to achieve high-performance quantum processors with neat and unified fabrication of all circuit elements, including coplanar waveguides (CPW), qubits, and airbridges, on the tantalum film-based platform. Here, we propose reliable tantalum airbridges with a separate or fully-capped structure fabricated via a novel lift-off method, where a barrier layer of aluminium (Al) film is first introduced to separate two layers of photoresist and then etched away before the deposition of tantalum film, followed by cleaning with piranha solution to remove the residual photoresist on the chip. We characterize such tantalum airbridges as control line jumpers, ground plane crossovers and even coupling elements. They exhibit excellent connectivity and minimal capacitive loss, effectively suppress microwave and flux crosstalk, and offer high freedom of coupling. In addition, by presenting a surface-13 tunable coupling superconducting quantum processor with median $T_1$ reaching above 100 $μ$s, the overall adaptability of tantalum airbridges is verified. The median single-qubit gate fidelity shows a tiny decrease from about 99.95% for the isolated Randomized Benchmarking to 99.94% for the simultaneous one. This fabrication method, compatible with all known superconducting materials, requires mild conditions of film deposition compared with the commonly used etching and grayscale lithography. Meanwhile, the experimental achievement of non-local coupling with controlled-Z (CZ) gate fidelity exceeding 99.2% may further facilitate qLDPC codes, laying a foundation for scalable quantum computation and quantum error correction with entirely tantalum elements.
Submitted 7 January, 2024;
originally announced January 2024.
-
A Surrogate-Assisted Extended Generative Adversarial Network for Parameter Optimization in Free-Form Metasurface Design
Authors:
Manna Dai,
Yang Jiang,
Feng Yang,
Joyjit Chattoraj,
Yingzhi Xia,
Xinxing Xu,
Weijiang Zhao,
My Ha Dao,
Yong Liu
Abstract:
Metasurfaces have widespread applications in fifth-generation (5G) microwave communication. Among the metasurface family, free-form metasurfaces excel in achieving intricate spectral responses compared to regular-shape counterparts. However, conventional numerical methods for free-form metasurfaces are time-consuming and demand specialized expertise. Alternatively, recent studies demonstrate that deep learning has great potential to accelerate and refine metasurface designs. Here, we present XGAN, an extended generative adversarial network (GAN) with a surrogate for high-quality free-form metasurface designs. The proposed surrogate provides a physical constraint to XGAN so that XGAN can accurately generate metasurfaces monolithically from input spectral responses. In comparative experiments involving 20000 free-form metasurface designs, XGAN achieves 0.9734 average accuracy and is 500 times faster than the conventional methodology. This method facilitates the metasurface library building for specific spectral responses and can be extended to various inverse design problems, including optical metamaterials, nanophotonic devices, and drug discovery.
Submitted 18 October, 2023;
originally announced January 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee,
et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Learning Merton's Strategies in an Incomplete Market: Recursive Entropy Regularization and Biased Gaussian Exploration
Authors:
Min Dai,
Yuchao Dong,
Yanwei Jia,
Xun Yu Zhou
Abstract:
We study Merton's expected utility maximization problem in an incomplete market, characterized by a factor process in addition to the stock price process, where all the model primitives are unknown. We take the reinforcement learning (RL) approach to learn optimal portfolio policies directly by exploring the unknown market, without attempting to estimate the model parameters. Based on the entropy-regularization framework for general continuous-time RL formulated in Wang et al. (2020), we propose a recursive weighting scheme on exploration that endogenously discounts the current exploration reward by the past accumulative amount of exploration. Such a recursive regularization restores the optimality of Gaussian exploration. However, contrary to the existing results, the optimal Gaussian policy turns out to be biased in general, due to the intertwined needs for hedging and for exploration. We present an asymptotic analysis of the resulting errors to show how the level of exploration affects the learned policies. Furthermore, we establish a policy improvement theorem and design several RL algorithms to learn Merton's optimal strategies. Finally, we carry out both simulation and empirical studies with a stochastic volatility environment to demonstrate the efficiency and robustness of the RL algorithms in comparison to the conventional plug-in method.
Submitted 18 December, 2023;
originally announced December 2023.
-
Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
Authors:
Dami Choi,
Derrick Xin,
Hamid Dadkhahi,
Justin Gilmer,
Ankush Garg,
Orhan Firat,
Chih-Kuan Yeh,
Andrew M. Dai,
Behrooz Ghorbani
Abstract:
In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method of pre-training on high-resource tasks, followed by fine-tuning on a mixture of high/low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze under what data regimes this method is applicable and show its improvements empirically in neural machine translation (NMT) and multi-lingual language modeling.
Submitted 11 December, 2023;
originally announced December 2023.
-
Charge-density wave transition in magnetic topological semimetal EuAl$_4$
Authors:
R. Yang,
C. C. Le,
P. Zhu,
Z. W. Wang,
T. Shang,
Y. M. Dai,
J. P. Hu,
M. Dressel
Abstract:
The interplay among topology, charge-density wave (CDW), and magnetism can give rise to a plethora of exotic quantum phenomena. Recently, a group of magnetic topological semimetals with tetragonal lattices and CDW order were found to exhibit anomalous magnetic instability, helical spin ordering, and the presence of skyrmions. However, the underlying mechanism responsible for these observations remains unclear. Here, we conducted a comprehensive investigation into the impact of CDW on the topological and magnetic properties of EuAl$_4$ using optical spectroscopy and first-principles calculations. Through optical spectroscopy, we observed a partial gap (60 meV) on the Fermi surface and an enhanced mid-infrared absorption around 0.4 eV after the CDW transition. Magneto-optical spectroscopy and first-principles calculations showed that, by affecting the band structure, the CDW order frustrates the antiferromagnetic interactions but strengthens the ferromagnetic ones, which can destabilize the magnetism. With lower symmetry in the CDW ordered state, carriers from the Weyl bands will mediate the anisotropic magnetic interactions, promoting the formation of chiral spin textures. Conversely, without the CDW order, the counterpart EuGa$_4$ shows robust collinear antiferromagnetic order. Our findings uncover the pivotal role played by CDW order in giving rise to intricate magnetism in topological materials and provide valuable insights into controlling topological and magnetic properties through the manipulation of CDW orders.
Submitted 27 November, 2023;
originally announced November 2023.
-
Point, Segment and Count: A Generalized Framework for Object Counting
Authors:
Zhizhong Huang,
Mingliang Dai,
Yi Zhang,
Junping Zhang,
Hongming Shan
Abstract:
Class-agnostic object counting aims to count all objects in an image with respect to example boxes or class names, a.k.a. few-shot and zero-shot counting. In this paper, we propose a generalized framework for both few-shot and zero-shot object counting based on detection. Our framework combines the advantages of two foundation models without compromising their zero-shot capability: (i) SAM to segment all possible objects as mask proposals, and (ii) CLIP to classify proposals to obtain accurate object counts. However, this strategy faces two obstacles: efficiency overhead, and small crowded objects that cannot be localized and distinguished. To address these issues, our framework, termed PseCo, follows three steps: point, segment, and count. Specifically, we first propose a class-agnostic object localization to provide accurate but minimal point prompts for SAM, which consequently not only reduces computation costs but also avoids missing small objects. Furthermore, we propose a generalized object classification that leverages CLIP image/text embeddings as the classifier, following a hierarchical knowledge distillation to obtain discriminative classifications among hierarchical mask proposals. Extensive experimental results on FSC-147, COCO, and LVIS demonstrate that PseCo achieves state-of-the-art performance in both few-shot/zero-shot object counting/detection. Code: https://github.com/Hzzone/PseCo
Submitted 27 March, 2024; v1 submitted 21 November, 2023;
originally announced November 2023.
-
Non-uniqueness of forced active scalar equations with even drift operators
Authors:
Mimi Dai,
Susan Friedlander
Abstract:
We consider forced active scalar equations with even and homogeneous degree 0 drift operator on $\mathbb T^d$. Inspired by the non-uniqueness construction for dyadic fluid models, by implementing a sum-difference convex integration scheme we obtain non-unique weak solutions for the active scalar equation in space $C_t^0C_x^α$ with $α<\frac{1}{2d+1}$. We note that in 1D, the regularity $α<\frac13$ is sharp as the energy identity is satisfied for solutions in $C^α$ with $α>\frac13$. Without external forcing, Isett and Vicol constructed non-unique weak solutions for such active scalar equations with spatial regularity $C_x^α$ for $α<\frac{1}{4d+1}$.
Submitted 10 November, 2023;
originally announced November 2023.
-
Non-unique weak solutions of forced SQG
Authors:
Mimi Dai,
Qirui Peng
Abstract:
We construct non-unique weak solutions $θ\in C_t^0C_x^{0-}$ for the forced surface quasi-geostrophic (SQG) equation. This is achieved through a convex integration scheme adapted to the sum-difference system of two distinct solutions. Without external forcing, non-unique weak solutions $θ$ in space $C_t^0C_x^α$ with $α<-\frac15$ were constructed by Buckmaster, Shkoller and Vicol, and Isett and Ma.
Submitted 20 October, 2023;
originally announced October 2023.
-
Multi-Domain Walking with Reduced-Order Models of Locomotion
Authors:
Min Dai,
Jaemin Lee,
Aaron D. Ames
Abstract:
Drawing inspiration from human multi-domain walking, this work presents a novel reduced-order model based framework for realizing multi-domain robotic walking. At the core of our approach is the viewpoint that human walking can be represented by a hybrid dynamical system, with continuous phases that are fully-actuated, under-actuated, and over-actuated and discrete changes in actuation type occurring with changes in contact. Leveraging this perspective, we synthesize a multi-domain linear inverted pendulum (MLIP) model of locomotion. Utilizing the step-to-step dynamics of the MLIP model, we successfully demonstrate multi-domain walking behaviors on the bipedal robot Cassie -- a high degree of freedom 3D bipedal robot. Thus, we show the ability to bridge the gap between multi-domain reduced order models and full-order multi-contact locomotion. Additionally, our results showcase the ability of the proposed method to achieve versatile speed-tracking performance and robust push recovery behaviors.
Submitted 4 October, 2023;
originally announced October 2023.
-
An improved algorithm for dynamical triangulations and simulations of finer lattices
Authors:
Mingwei Dai,
Walter Freeman,
Jack Laiho,
Marc Schiffer,
Judah Unmuth-Yockey
Abstract:
We introduce a new algorithm for the simulation of Euclidean dynamical triangulations that mimics the Metropolis-Hastings algorithm, but where all proposed moves are accepted. This rejection-free algorithm allows for the factorization of local and global terms in the action, a condition needed for efficient simulation of theories with global terms, while still maintaining detailed balance. We test our algorithm on the $2d$ Ising model, and against results for EDT obtained with standard Metropolis. Our new algorithm allows us to simulate EDT at finer lattice spacings than previously possible, and we find geometries that resemble semiclassical Euclidean de Sitter space in agreement with earlier results at coarser lattices. The agreement between lattice data and the classical de Sitter solution continues to get better as the lattice spacing decreases.
Submitted 21 September, 2023;
originally announced September 2023.
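As context for the $2d$ Ising test mentioned in the abstract above, the following is a minimal sketch of the standard single-spin-flip Metropolis baseline, in which proposed moves can be rejected. It is not the authors' rejection-free algorithm, whose details are not given in this listing. The function names, the coupling $J = 1$, and the periodic boundary conditions are assumptions made for illustration.

```python
import math
import random


def metropolis_sweep(spins, beta, rng):
    """One sweep of standard Metropolis on an n x n Ising lattice (J = 1, periodic).

    Assumption: plain textbook Metropolis, shown only as the baseline that a
    rejection-free scheme would be compared against.
    """
    n = len(spins)
    for _ in range(n * n):
        i = rng.randrange(n)
        j = rng.randrange(n)
        # Sum of the four nearest neighbours with periodic wrap-around.
        neighbours = (
            spins[(i + 1) % n][j]
            + spins[(i - 1) % n][j]
            + spins[i][(j + 1) % n]
            + spins[i][(j - 1) % n]
        )
        # Energy change from flipping spin (i, j) when E = -sum over bonds of s_a * s_b.
        delta_e = 2 * spins[i][j] * neighbours
        # Accept with probability min(1, exp(-beta * delta_e)); otherwise reject.
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            spins[i][j] = -spins[i][j]


def magnetization(spins):
    """Average spin per site."""
    n = len(spins)
    return sum(sum(row) for row in spins) / (n * n)


# Usage: start from the fully ordered state at low temperature.
lattice = [[1] * 4 for _ in range(4)]
metropolis_sweep(lattice, beta=10.0, rng=random.Random(0))
```

The rejection step in the last branch is what a rejection-free method avoids; at low temperature most proposals in this baseline are rejected, which is where such methods are typically expected to help.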
-
Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation
Authors:
Zhichao Wang,
Mengyu Dai,
Keld Lundgaard
Abstract:
The advent of ChatGPT has introduced innovative methods for information gathering and analysis. However, the information provided by ChatGPT is limited to text, and the visualization of this information remains constrained. Previous research has explored zero-shot text-to-video (TTV) approaches to transform text into videos. However, these methods lacked control over the identity of the generated audio, i.e., not identity-agnostic, hindering their effectiveness. To address this limitation, we propose a novel two-stage framework for person-agnostic video cloning, specifically focusing on TTV generation. In the first stage, we leverage pretrained zero-shot models to achieve text-to-speech (TTS) conversion. In the second stage, an audio-driven talking head generation method is employed to produce compelling videos from the audio generated in the first stage. This paper presents a comparative analysis of different TTS and audio-driven talking head generation methods, identifying the most promising approach for future research and development. Some audio and video samples can be found at the following link: https://github.com/ZhichaoWang970201/Text-to-Video/tree/main.
Submitted 11 August, 2023;
originally announced August 2023.
-
DialogRE^C+: An Extension of DialogRE to Investigate How Much Coreference Helps Relation Extraction in Dialogs
Authors:
Yiyun Xiong,
Mengwei Dai,
Fei Li,
Hao Fei,
Bobo Li,
Shengqiong Wu,
Donghong Ji,
Chong Teng
Abstract:
Dialogue relation extraction (DRE), which identifies the relations between argument pairs in dialogue text, suffers greatly from the frequent occurrence of personal pronouns, or entity and speaker coreference. This work introduces a new benchmark dataset, DialogRE^C+, which brings coreference resolution into the DRE scenario. With the aid of high-quality coreference knowledge, the reasoning of argument relations is expected to be enhanced. In the DialogRE^C+ dataset, we manually annotate a total of 5,068 coreference chains over 36,369 argument mentions based on the existing DialogRE data, where four different coreference chain types, namely speaker chain, person chain, location chain and organization chain, are explicitly marked. We further develop 4 coreference-enhanced graph-based DRE models, which learn effective coreference representations for improving the DRE task. We also train a coreference resolution model based on our annotations and evaluate the effect of automatically extracted coreference chains, demonstrating the practicality of our dataset and its potential for other domains and tasks.
Submitted 12 August, 2023; v1 submitted 8 August, 2023;
originally announced August 2023.
-
Terahertz spin currents in nanoscale spatial resolution
Authors:
Jiahua Cai,
Mingcong Dai,
Sai Chen,
Peng Chen,
Jiaqi Wang,
Hongting Xiong,
Zejun Ren,
Shaojie Liu,
Zhongkai Liu,
Caihua Wan,
Xiaojun Wu
Abstract:
The ability to generate, detect, and control coherent terahertz (THz) spin currents with femtosecond temporal and nanoscale spatial resolution has significant ramifications. The diffraction limit of concentrated THz radiation, which has a wavelength range of 5 μm-1.5 mm, has impeded the accumulation of nanodomain data of magnetic structures and spintronic dynamics despite its potential benefits. Contemporary spintronic optoelectronic apparatuses with dimensions 100 nm presented a challenge for researchers due to this restriction. In this study, we demonstrate the use of spintronic THz emission nanoscopy (STEN), which allows for the efficient injection and precise coherent detection of ultrafast THz spin currents at the nanoscale. Furthermore, STEN is an effective method that does not require invasion for characterising and etching nanoscale spintronic heterostructures. The cohesive integration of nanophotonics, nanospintronics, and THz-nano technology into a single platform is poised to accelerate the development of high-frequency spintronic optoelectronic nanodevices and their revolutionary technical applications.
Submitted 1 July, 2023;
originally announced July 2023.
-
A Fast Fourier Convolutional Deep Neural Network For Accurate and Explainable Discrimination Of Wheat Yellow Rust And Nitrogen Deficiency From Sentinel-2 Time-Series Data
Authors:
Yue Shi,
Liangxiu Han,
Pablo González-Moreno,
Darren Dancey,
Wenjiang Huang,
Zhiqiang Zhang,
Yuanyuan Liu,
Mengning Huan,
Hong Miao,
Min Dai
Abstract:
Accurate and timely detection of plant stress is essential for yield protection, allowing better-targeted intervention strategies. Recent advances in remote sensing and deep learning have shown great potential for rapid non-invasive detection of plant stress in a fully automated and reproducible manner. However, existing models face several challenges: 1) computational inefficiency and misclassification between different stresses with similar symptoms; and 2) poor interpretability of the host-stress interaction. In this work, we propose a novel fast Fourier Convolutional Neural Network (FFDNN) for accurate and explainable detection of two plant stresses with similar symptoms (i.e. Wheat Yellow Rust and Nitrogen Deficiency). Specifically, unlike existing CNN models, the main components of the proposed model include: 1) a fast Fourier convolutional block, which uses a new fast Fourier transformation kernel as the basic perception unit in place of the traditional convolutional kernel, to capture both local and global responses to plant stress at various time scales and to improve computing efficiency with fewer learnable parameters in the Fourier domain; and 2) a Capsule Feature Encoder that encapsulates the extracted features into a series of vector features representing the part-to-whole relationship within the hierarchical structure of the host-stress interactions of the specific stress. In addition, in order to alleviate over-fitting, a photochemical vegetation indices-based filter is placed as a pre-processing operator to remove non-photochemical noise from the input Sentinel-2 time series.
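The idea of replacing a convolutional kernel with a Fourier-domain operation rests on the convolution theorem: a circular convolution in the time domain equals an element-wise product in the Fourier domain. The sketch below is not the FFDNN block from the paper; it only checks that identity on a toy series. The function names, the 8-step `signal`, and the zero-padded `kernel` are illustrative assumptions, and a naive O(n^2) DFT stands in for a real FFT so that the example needs only the standard library.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (the inverse applies the 1/n factor)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        out.append(total / n if inverse else total)
    return out


def circular_conv_direct(signal, kernel):
    """Circular convolution computed directly in the time domain."""
    n = len(signal)
    return [
        sum(signal[(i - j) % n] * kernel[j] for j in range(n))
        for i in range(n)
    ]


def circular_conv_fourier(signal, kernel):
    """Same convolution via an element-wise product of the two spectra."""
    spectrum = [a * b for a, b in zip(dft(signal), dft(kernel))]
    return [v.real for v in dft(spectrum, inverse=True)]


# Hypothetical toy inputs: an 8-step time series and a kernel padded to length 8.
signal = [0.2, 0.5, 0.9, 0.7, 0.4, 0.3, 0.6, 0.8]
kernel = [0.5, 0.3, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0]

direct = circular_conv_direct(signal, kernel)
fourier = circular_conv_fourier(signal, kernel)
max_abs_diff = max(abs(a - b) for a, b in zip(direct, fourier))
```

With an actual FFT the Fourier route costs O(n log n) instead of O(n^2), which is one plausible reading of the efficiency gain the abstract mentions; how the paper parameterises its kernels is not stated here.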
Submitted 29 June, 2023;
originally announced June 2023.
-
Theoretical study of scalar meson $a_0(1710)$ in the $η_c \to {\bar{K}}^0K^+π^- $ reaction
Authors:
Yan Ding,
Xiao-Hui Zhang,
Meng-Yuan Dai,
En Wang,
De-Min Li,
Li-Sheng Geng,
Ju-Jun Xie
Abstract:
We investigate the process $η_c \to {\bar{K}}^0K^+π^-$ by taking into account the $S$-wave ${K^*\bar{K}^*}$ and $ρω$ interactions within the unitary coupled-channel approach, where the scalar meson $a_0(1710)$ is dynamically generated. In addition, the contributions from the intermediate resonances $K_0^*(1430)^{-}\to {\bar{K}}^0π^- $ and $K_0^*(1430)^{0}\to K^+π^-$ are also considered. We find a significant dip structure around 1.8~GeV, associated with the $a_0(1710)$, in the ${\bar{K}^0K^+}$ invariant mass distribution, and clear peaks of the $K_0^*(1430)$ in the ${\bar{K}}^0π^-$ and $K^+π^-$ invariant mass distributions, consistent with the {\it BABAR} measurements. We further estimate the branching fractions $\mathcal{B}(η_c \to \bar{K}^{*0}K^{\ast+}π^-)= 5.5\times10^{-3}$ and $\mathcal{B}(η_c \to ωρ^+π^-)= 7.9\times10^{-3}$. Our predictions can be tested by the BESIII and Belle II experiments in the future.
Submitted 14 December, 2023; v1 submitted 28 June, 2023;
originally announced June 2023.
-
Global existence of 2D electron MHD near a steady state
Authors:
Mimi Dai
Abstract:
We study electron magnetohydrodynamics (MHD) in two-dimensional geometry, which has a rich family of steady states. In an anisotropic resistivity context, we show global-in-time existence of small smooth solutions near a shear-type steady state. The convergence rate of the solution to the steady state is also obtained.
Submitted 22 June, 2023;
originally announced June 2023.
-
Uniqueness for a stochastic ideal dyadic MHD model
Authors:
Mimi Dai,
Qirui Peng,
Cheng Ouyang
Abstract:
We study a stochastic dyadic model with both forward and backward energy cascade mechanisms for inviscid and non-resistive magnetohydrodynamics. For a particular class of stochastic forcing, we show weak uniqueness for the stochastic system. However, the solution dissipates the energy, which is formally an invariant quantity for the system.
Submitted 19 June, 2023;
originally announced June 2023.
-
Training Socially Aligned Language Models on Simulated Social Interactions
Authors:
Ruibo Liu,
Ruixin Yang,
Chenyan Jia,
Ge Zhang,
Denny Zhou,
Andrew M. Dai,
Diyi Yang,
Soroush Vosoughi
Abstract:
Social alignment in AI systems aims to ensure that these models behave according to established societal values. However, unlike humans, who derive consensus on value judgments through social interaction, current language models (LMs) are trained to rigidly replicate their training corpus in isolation, leading to subpar generalization in unfamiliar scenarios and vulnerability to adversarial attacks. This work presents a novel training paradigm that permits LMs to learn from simulated social interactions. In comparison to existing methodologies, our approach is considerably more scalable and efficient, demonstrating superior performance in alignment benchmarks and human evaluations. This paradigm shift in the training of LMs brings us a step closer to developing AI systems that can robustly and accurately reflect societal norms and values.
Submitted 28 October, 2023; v1 submitted 26 May, 2023;
originally announced May 2023.
-
Are classification metrics good proxies for SN Ia cosmological constraining power?
Authors:
Alex I. Malz,
Mi Dai,
Kara A. Ponder,
Emille E. O. Ishida,
Santiago Gonzalez-Gaitain,
Rupesh Durgesh,
Alberto Krone-Martins,
Rafael S. de Souza,
Noble Kennamer,
Sreevarsha Sreejith,
Lluis Galbany,
The LSST Dark Energy Science Collaboration,
The Cosmostatistics Initiative
Abstract:
Context: When selecting a classifier to use for a supernova Ia (SN Ia) cosmological analysis, it is common to make decisions based on metrics of classification performance, i.e. contamination within the photometrically classified SN Ia sample, rather than a measure of cosmological constraining power. If the former is an appropriate proxy for the latter, this practice would save those designing an analysis pipeline from the computational expense of a full cosmology forecast. Aims: This study tests the assumption that classification metrics are an appropriate proxy for cosmology metrics. Methods: We emulate photometric SN Ia cosmology samples with controlled contamination rates of individual contaminant classes and evaluate each of them under a set of classification metrics. We then derive cosmological parameter constraints from all samples under two common analysis approaches and quantify the impact of contamination by each contaminant class on the resulting cosmological parameter estimates. Results: We observe that cosmology metrics are sensitive to both the contamination rate and the class of the contaminating population, whereas the classification metrics are insensitive to the latter. Conclusions: We therefore discourage exclusive reliance on classification-based metrics for cosmological analysis design decisions, e.g. classifier choice, and instead recommend optimizing using a metric of cosmological parameter constraining power.
Submitted 23 May, 2023;
originally announced May 2023.
-
PaLM 2 Technical Report
Authors:
Rohan Anil,
Andrew M. Dai,
Orhan Firat,
Melvin Johnson,
Dmitry Lepikhin,
Alexandre Passos,
Siamak Shakeri,
Emanuel Taropa,
Paige Bailey,
Zhifeng Chen,
Eric Chu,
Jonathan H. Clark,
Laurent El Shafey,
Yanping Huang,
Kathy Meier-Hellstern,
Gaurav Mishra,
Erica Moreira,
Mark Omernick,
Kevin Robinson,
Sebastian Ruder,
Yi Tay,
Kefan Xiao,
Yuanzhong Xu,
Yujing Zhang,
Gustavo Hernandez Abrego
, et al. (103 additional authors not shown)
Abstract:
We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM. This improved efficiency enables broader deployment while also allowing the model to respond faster, for a more natural pace of interaction. PaLM 2 demonstrates robust reasoning capabilities exemplified by large improvements over PaLM on BIG-Bench and other reasoning tasks. PaLM 2 exhibits stable performance on a suite of responsible AI evaluations, and enables inference-time control over toxicity without additional overhead or impact on other capabilities. Overall, PaLM 2 achieves state-of-the-art performance across a diverse set of tasks and capabilities.
When discussing the PaLM 2 family, it is important to distinguish between pre-trained models (of various sizes), fine-tuned variants of these models, and the user-facing products that use these models. In particular, user-facing products typically include additional pre- and post-processing steps. Additionally, the underlying models may evolve over time. Therefore, one should not expect the performance of user-facing products to exactly match the results reported in this report.
Submitted 13 September, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
-
A Low-Mass Helium Star Progenitor Model for the Type Ibn SN 2020nxt
Authors:
Qinan Wang,
Anika Goel,
Luc Dessart,
Ori D. Fox,
Melissa Shahbandeh,
Sofia Rest,
Armin Rest,
Jose H. Groh,
Andrew Allan,
Claes Fransson,
Nathan Smith,
Griffin Hosseinzadeh,
Alexei V. Filippenko,
Jennifer Andrews,
K. Azalee Bostroem,
Thomas G. Brink,
Peter Brown,
Jamison Burke,
Roger Chevalier,
Geoffrey C. Clayton,
Mi Dai,
Kyle W. Davis,
Ryan J. Foley,
Sebastian Gomez,
Chelsea Harris
, et al. (33 additional authors not shown)
Abstract:
A growing number of supernovae (SNe) are now known to exhibit evidence for significant interaction with a dense, pre-existing, circumstellar medium (CSM). SNe Ibn comprise one such class that can be characterised by both rapidly evolving light curves and persistent narrow He I lines. The origin of such a dense CSM in these systems remains a pressing question, specifically concerning the progenitor system and mass-loss mechanism. In this paper, we present multi-wavelength data of the Type Ibn SN 2020nxt, including $HST$/STIS ultraviolet spectra. We fit the data with recently updated CMFGEN models designed to handle configurations for SNe Ibn. The UV coverage yields strong constraints on the energetics and, when combined with the CMFGEN models, offers new insight into potential progenitor systems. We find the most successful model is a $\lesssim4 {\rm M}_\odot$ helium star that lost its $\sim 1\,{\rm M}_\odot$ He-rich envelope in the years preceding core collapse. We also consider viable alternatives, such as a He white dwarf merger. Ultimately, we conclude that at least some SNe Ibn do not arise from single, massive ($>30 {\rm M}_\odot$) Wolf-Rayet-like stars.
Submitted 8 May, 2023;
originally announced May 2023.
-
Accelerated Screening of Ternary Chalcogenides for High-Performance Optoelectronic Materials
Authors:
Chen Shen,
Tianshu Li,
Yixuan Zhang,
Teng Long,
Nuno Miguel Fortunato,
Fei Liang,
Mian Dai,
Jiahong Shen,
Chris Wolverton,
Hongbin Zhang
Abstract:
Chalcogenides, which refer to chalcogen anions, have attracted considerable attention in multiple fields of application, such as optoelectronics, thermoelectrics, transparent contacts, and thin film transistors. In comparison to their oxide counterparts, chalcogenides have demonstrated higher mobility and \textit{p}-type dopability, owing to larger orbital overlaps in metal-X covalent chemical bonds and higher-energy valence bands derived from p-orbitals. Despite the potential of chalcogenides, the number of successfully synthesized compounds remains relatively low compared to oxides, suggesting the presence of numerous unexplored chalcogenides with fascinating physical characteristics. In this study, we implemented a systematic high-throughput screening process combined with first-principles calculations on ternary chalcogenides using 34 crystal structure prototypes. We generated a computational material database containing over 400,000 compounds by exploiting the ion-substitution approach at different atomic sites with elements in the periodic table. The thermodynamic stabilities of the candidates were validated using the chalcogenides included in the Open Quantum Materials Database. Moreover, we trained a model based on Crystal Graph Convolutional Neural Networks to predict the thermodynamic stability of novel materials. Furthermore, we theoretically evaluated the electronic structures of the stable candidates using accurate hybrid functionals. A series of in-depth characteristics, including the carrier effective masses, electronic configuration, and photovoltaic conversion efficiency, was also investigated. Our work provides useful guidance for further experimental research in the synthesis and characterization of such chalcogenides as promising candidates, as well as charting the stability and optoelectronic performance of ternary chalcogenides.
Submitted 4 May, 2023;
originally announced May 2023.
-
Maximize the Long-term Average Revenue of Network Slice Provider via Admission Control Among Heterogeneous Slices
Authors:
Miao Dai,
Gang Sun,
Hongfang Yu,
Dusit Niyato
Abstract:
Network slicing endows 5G/B5G with differentiated and customized capabilities to cope with the proliferation of diversified services, whereas limited physical network resources may not be able to support all service requests. Slice admission control is regarded as an essential means to ensure service quality and service isolation when the network is under burden. Herein, the scenario where rational tenants coexist with partially competitive network slice providers is adopted. We aim to maximize the long-term average revenue of the network operators through slice admission control, with the feasibility of multidimensional resource requirements, the priority differences among heterogeneous slices, and the admission fairness within each slice taken into account concurrently. We prove the intractability of our problem by a reduction from the Multidimensional Knapsack Problem (MKP), and propose a two-stage algorithm called MPSAC to find a sub-optimal solution efficiently. The principle of MPSAC is to split the original problem into two sub-problems: inter-slice decision-making and intra-slice quota allocation, which are solved using a heuristic method and a tailored auction mechanism, respectively. Extensive simulations are carried out to demonstrate the efficacy of our algorithm; the results show that its long-term average revenue is at least 9.6% higher than that of the comparison methods while maintaining better priority relations and achieving improved fairness performance.
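To make the knapsack structure behind admission control concrete, here is a minimal sketch of a greedy heuristic for a multidimensional knapsack: requests are ranked by revenue per unit of normalised resource load and admitted while every resource dimension still fits. This is not MPSAC, whose heuristic and auction stages are not described in the abstract. The function `greedy_admission`, the request names, the revenues, and the two-dimensional capacity are all hypothetical.

```python
def greedy_admission(requests, capacity):
    """Admit requests greedily by revenue density, respecting every resource dimension.

    requests: list of (name, revenue, demand) with demand a tuple per resource type.
    capacity: tuple of available amounts, one per resource type.
    """
    def density(request):
        _, revenue, demand = request
        # Normalise each demand by its capacity so that dimensions are comparable.
        load = sum(d / c for d, c in zip(demand, capacity))
        return revenue / load

    remaining = list(capacity)
    admitted = []
    for name, _, demand in sorted(requests, key=density, reverse=True):
        # A request is feasible only if it fits in all dimensions at once.
        if all(d <= r for d, r in zip(demand, remaining)):
            admitted.append(name)
            remaining = [r - d for r, d in zip(remaining, demand)]
    return admitted, remaining


# Hypothetical slice requests: (name, revenue, (bandwidth units, compute units)).
requests = [
    ("eMBB-1", 10.0, (4, 2)),
    ("URLLC-1", 8.0, (1, 3)),
    ("mMTC-1", 3.0, (2, 1)),
    ("eMBB-2", 9.0, (5, 4)),
]
capacity = (8, 6)

admitted, remaining = greedy_admission(requests, capacity)
total_revenue = sum(rev for name, rev, _ in requests if name in admitted)
```

On this toy instance the heuristic admits `URLLC-1`, `eMBB-1`, and `mMTC-1` for a total revenue of 21.0 and rejects `eMBB-2`, which no longer fits. A greedy rule like this carries no optimality guarantee, and it ignores the slice priorities and intra-slice fairness that the paper treats as part of the problem.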
Submitted 19 April, 2023;
originally announced April 2023.
-
Cross-head Supervision for Crowd Counting with Noisy Annotations
Authors:
Mingliang Dai,
Zhizhong Huang,
Jiaqi Gao,
Hongming Shan,
Junping Zhang
Abstract:
Noisy annotations such as missing annotations and location shifts often exist in crowd counting datasets due to multi-scale head sizes, high occlusion, etc. These noisy annotations severely affect the model training, especially for density map-based methods. To alleviate the negative impact of noisy annotations, we propose a novel crowd counting model with one convolution head and one transformer head, in which these two heads can supervise each other in noisy areas, called Cross-Head Supervision. The resultant model, CHS-Net, can synergize different types of inductive biases for better counting. In addition, we develop a progressive cross-head supervision learning strategy to stabilize the training process and provide more reliable supervision. Extensive experimental results on ShanghaiTech and QNRF datasets demonstrate superior performance over state-of-the-art methods. Code is available at https://github.com/RaccoonDML/CHSNet.
Submitted 16 March, 2023;
originally announced March 2023.