Incremental Open-set Domain Adaptation

Authors: Sayan Rakshit, Hmrishav Bandyopadhyay, Nibaran Das, Biplab Banerjee

Abstract: Catastrophic forgetting makes neural network models unstable when learning visual domains consecutively. The neural network model drifts to catastrophic forgetting-induced low performance of previously learnt domains when training with new domains. We illuminate this current neural network model weakness and develop a forgetting-resistant incremental learning strategy. Here, we propose a new unsupervised incremental open-set domain adaptation (IOSDA) problem for image classification. Open-set domain adaptation adds complexity to the incremental domain adaptation problem since each target domain has more classes than the source domain. In IOSDA, the model is trained on domain streams phase by phase over time. Inference uses test data from all target domains without revealing their identities. We propose IOSDA-Net, a two-stage learning pipeline, to solve the problem. The first module replicates prior domains from random noise using a generative framework and creates a pseudo source domain. In the second step, this pseudo source is adapted to the present target domain. We test our model on Office-Home, DomainNet, and UPRN-RSDA, a newly curated optical remote sensing dataset.

Submitted 31 August, 2024; originally announced September 2024.

arXiv:2409.00397 [pdf, other]

COSMo: CLIP Talks on Open-Set Multi-Target Domain Adaptation

Authors: Munish Monga, Sachin Kumar Giroh, Ankit Jha, Mainak Singha, Biplab Banerjee, Jocelyn Chanussot

Abstract: Multi-Target Domain Adaptation (MTDA) entails learning domain-invariant information from a single source domain and applying it to multiple unlabeled target domains. Yet, existing MTDA methods predominantly focus on addressing domain shifts within visual features, often overlooking semantic features and struggling to handle unknown classes, resulting in what is known as Open-Set (OS) MTDA. While large-scale vision-language foundation models like CLIP show promise, their potential for MTDA remains largely unexplored. This paper introduces COSMo, a novel method that learns domain-agnostic prompts through source domain-guided prompt learning to tackle the MTDA problem in the prompt space. By leveraging a domain-specific bias network and separate prompts for known and unknown classes, COSMo effectively adapts across domain and class shifts. To the best of our knowledge, COSMo is the first method to address Open-Set Multi-Target DA (OSMTDA), offering a more realistic representation of real-world scenarios and addressing the challenges of both open-set and multi-target DA. COSMo demonstrates an average improvement of $5.1\%$ across three challenging datasets: Mini-DomainNet, Office-31, and Office-Home, compared to other related DA methods adapted to operate within the OSMTDA setting. Code is available at: https://github.com/munish30monga/COSMo

Submitted 31 August, 2024; originally announced September 2024.

Comments: Accepted in BMVC 2024

arXiv:2407.21456 [pdf, other]

A Ball Divergence Based Measure For Conditional Independence Testing

Authors: Bilol Banerjee, Bhaswar B. Bhattacharya, Anil K. Ghosh

Abstract: In this paper we introduce a new measure of conditional dependence between two random vectors ${\boldsymbol X}$ and ${\boldsymbol Y}$ given another random vector $\boldsymbol Z$ using the ball divergence. Our measure characterizes conditional independence and does not require any moment assumptions. We propose a consistent estimator of the measure using a kernel averaging technique and derive its asymptotic distribution. Using this statistic we construct two tests for conditional independence, one in the model-${\boldsymbol X}$ framework and the other based on a novel local wild bootstrap algorithm. In the model-${\boldsymbol X}$ framework, which assumes the knowledge of the distribution of ${\boldsymbol X}|{\boldsymbol Z}$, applying the conditional randomization test we obtain a method that controls Type I error in finite samples and is asymptotically consistent, even if the distribution of ${\boldsymbol X}|{\boldsymbol Z}$ is incorrectly specified up to distance preserving transformations. More generally, in situations where ${\boldsymbol X}|{\boldsymbol Z}$ is unknown or hard to estimate, we design a double-bandwidth based local wild bootstrap algorithm that asymptotically controls both Type I error and power. We illustrate the advantage of our method, both in terms of Type I error and power, in a range of simulation settings and also in a real data example. A consequence of our theoretical results is a general framework for studying the asymptotic properties of a 2-sample conditional $V$-statistic, which is of independent interest.

Submitted 31 July, 2024; originally announced July 2024.

arXiv:2407.12867 [pdf, other]

Swift-BAT GUANO follow-up of gravitational-wave triggers in the third LIGO-Virgo-KAGRA observing run

Authors: Gayathri Raman, Samuele Ronchini, James Delaunay, Aaron Tohuvavohu, Jamie A. Kennea, Tyler Parsotan, Elena Ambrosi, Maria Grazia Bernardini, Sergio Campana, Giancarlo Cusumano, Antonino D'Ai, Paolo D'Avanzo, Valerio D'Elia, Massimiliano De Pasquale, Simone Dichiara, Phil Evans, Dieter Hartmann, Paul Kuin, Andrea Melandri, Paul O'Brien, Julian P. Osborne, Kim Page, David M. Palmer, Boris Sbarufatti, Gianpiero Tagliaferri, et al. (1797 additional authors not shown)

Abstract: We present results from a search for X-ray/gamma-ray counterparts of gravitational-wave (GW) candidates from the third observing run (O3) of the LIGO-Virgo-KAGRA (LVK) network using the Swift Burst Alert Telescope (Swift-BAT). The search includes 636 GW candidates received in low latency, 86 of which have been confirmed by the offline analysis and included in the third cumulative Gravitational-Wave Transient Catalog (GWTC-3). Targeted searches were carried out on the entire GW sample using the maximum-likelihood NITRATES pipeline on the BAT data made available via the GUANO infrastructure. We do not detect any significant electromagnetic emission that is temporally and spatially coincident with any of the GW candidates. We report flux upper limits in the 15-350 keV band as a function of sky position for all the catalog candidates. For GW candidates where the Swift-BAT false alarm rate is less than 10$^{-3}$ Hz, we compute the GW-BAT joint false alarm rate. Finally, the derived Swift-BAT upper limits are used to infer constraints on the putative electromagnetic emission associated with binary black hole mergers.

Submitted 13 July, 2024; originally announced July 2024.

Comments: 50 pages, 10 figures, 4 tables

arXiv:2407.05145 [pdf, other]

On high-dimensional modifications of the nearest neighbor classifier

Authors: Annesha Ghosh, Bilol Banerjee, Anil K. Ghosh

Abstract: The nearest neighbor classifier is arguably the simplest and most popular nonparametric classifier available in the literature. However, due to the concentration of pairwise distances and the violation of the neighborhood structure, this classifier often suffers in high-dimension, low-sample size (HDLSS) situations, especially when the scale difference between the competing classes dominates their location difference. Several attempts have been made in the literature to take care of this problem. In this article, we discuss some of these existing methods and propose some new ones. We carry out some theoretical investigations in this regard and analyze several simulated and benchmark datasets to compare the empirical performances of the proposed methods with some of the existing ones.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.04207 [pdf, other]

Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning

Authors: Mainak Singha, Ankit Jha, Divyam Gupta, Pranav Singla, Biplab Banerjee

Abstract: We address the challenges inherent in sketch-based image retrieval (SBIR) across various settings, including zero-shot SBIR, generalized zero-shot SBIR, and fine-grained zero-shot SBIR, by leveraging the vision-language foundation model CLIP. While recent endeavors have employed CLIP to enhance SBIR, these approaches predominantly follow uni-modal prompt processing and fail to fully exploit CLIP's integrated visual and textual capabilities. To bridge this gap, we introduce SpLIP, a novel multi-modal prompt learning scheme designed to operate effectively with frozen CLIP backbones. We diverge from existing multi-modal prompting methods that treat visual and textual prompts independently or integrate them in a limited fashion, leading to suboptimal generalization. SpLIP implements a bi-directional prompt-sharing strategy that enables mutual knowledge exchange between CLIP's visual and textual encoders, fostering a more cohesive and synergistic prompt processing mechanism that significantly reduces the semantic gap between the sketch and photo embeddings. In addition to pioneering multi-modal prompt learning, we propose two innovative strategies for further refining the embedding space. The first is an adaptive margin generation for the sketch-photo triplet loss, regulated by CLIP's class textual embeddings. The second introduces a novel task, termed conditional cross-modal jigsaw, aimed at enhancing fine-grained sketch-photo alignment by implicitly modeling sketches' viable patch arrangement using knowledge of unshuffled photos. Our comprehensive experimental evaluations across multiple benchmarks demonstrate the superior performance of SpLIP in all three SBIR scenarios. Project page: https://mainaksingha01.github.io/SpLIP/

Submitted 22 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted in ECCV 2024

arXiv:2405.15855 [pdf, other]

Camelidae on BOAT: observation of a second spectral component in GRB 221009A

Authors: Biswajit Banerjee, Samanta Macera, Alessio Ludovico De Santis, Alessio Mei, Jacopo Tissino, Gor Oganesyan, Dmitry D. Frederiks, Alexandra L. Lysenko, Dmitry S. Svinkin, Anastasia E. Tsvetkova, Marica Branchesi

Abstract: Observing and understanding the origin of the very-high-energy (VHE) spectral component in gamma-ray bursts (GRBs) has so far been challenging because of the lack of sensitivity in MeV-GeV observations. The majestic GRB 221009A, known as the brightest of all time (BOAT), offers a unique opportunity to identify spectral components during the prompt and early afterglow phases and probe their origin. Analyzing simultaneous observations spanning from keV to TeV energies, we identified two distinct spectral components during the initial 20 minutes of the burst. The second spectral component peaks between $10$ and $300$ GeV, and the bolometric fluence (10 MeV-10 TeV) is estimated to be greater than 2$\times10^{-3}$ erg/cm$^{2}$. Performing broad-band spectral modeling, we provide constraints on the magnetic field and the energies of electrons accelerated in the external relativistic shock. We interpret the VHE component as an afterglow emission that is affected by luminous prompt MeV radiation at early times.

Submitted 24 May, 2024; originally announced May 2024.

Comments: Comments and suggestions are welcome

arXiv:2405.13559 [pdf]

Identification of microstructure from macroscopic measurement using inverse multiscale analysis

Authors: Anjan Mukherjee, Biswanath Banerjee

Abstract: Most tailored materials are heterogeneous at the ingredient level. Analysis of those heterogeneous structures requires knowledge of the microstructure. With knowledge of the microstructure, multiscale analysis is carried out with homogenization at the micro level. Second-order homogenization is carried out whenever the ingredient size is comparable to the structure size. Therefore, knowledge of the microstructure and its size is indispensable to analyzing those heterogeneous structures. Again, any structural response contains all the information about the microstructure, like microstructure distribution, volume fraction, size of ingredients, etc. Here, inverse analysis is carried out to identify a heterogeneous microstructure from macroscopic measurement. Two-step inverse analysis is carried out in the identification process; in the first step, the macrostructure's length scale and effective properties are identified from the macroscopic measurement using gradient-based optimization. In the second step, those effective properties and length scales are used to determine the microstructure in inverse second-order homogenization.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Structural Engineering Convention SEC 2023

arXiv:2405.13384 [pdf, other]

Elastic-gap free strain gradient crystal plasticity model that effectively account for plastic slip gradient and grain boundary dissipation

Authors: Anjan Mukherjee, Biswanath Banerjee

Abstract: This paper proposes an elastic-gap free strain gradient crystal plasticity model that addresses dissipation caused by the plastic slip gradient and the grain boundary (GB) Burgers tensor. The model involves splitting the plastic slip gradient and GB Burgers tensor into energetic and dissipative quantities. Unlike conventional models, the bulk and GB defect energy are considered to be a quadratic functional of the energetic portion of the slip gradient and GB Burgers tensor. The higher-order stresses for each individual slip system and the GB stresses are derived from the defect energy, following a similar evolution as the Armstrong-Frederick type backstress model in classical plasticity. The evolution equations consist of a hardening and a relaxation term. The relaxation term brings the nonlinearity in hardening and causes an additional dissipation. The applicability of the proposed model is numerically established with the help of a two-dimensional finite element implementation. Specifically, the bulk and GB relaxation coefficients are critically evaluated based on various circumstances, considering single crystal infinite shear layer, periodic bicrystal shearing, and bicrystal tension problems. In contrast to the Gurtin-type model, the proposed model smoothly captures the apparent strengthening at saturation without causing any abrupt stress jump under non-proportional loading conditions. Moreover, when subjected to cyclic loading, the stress-strain curve maintains its curvature during reverse loading. The numerical simulation reveals that the movement of geometrically necessary dislocations (GNDs) towards the GB is influenced by the bulk recovery coefficient, while the dissipation and amount of accumulation of GNDs near the GB are controlled by the GB recovery coefficient.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Submitted in Journal of the Mechanics and Physics of Solids

arXiv:2405.09612 [pdf, other]

Imprint of 'local opacity' effect in gamma-ray spectrum of blazar jet

Authors: Sushmita Agarwal, Amit Shukla, Karl Mannheim, Bhargav Vaidya, Biswajit Banerjee

Abstract: Relativistic jets from accreting supermassive black holes at cosmological distances can be powerful emitters of $γ$-rays. However, the precise mechanisms and locations responsible for the dissipation of energy within these jets, leading to observable $γ$-ray radiation, remain elusive. We detect evidence for an intrinsic absorption feature in the $γ$-ray spectrum at energies exceeding $10\,$GeV, presumably due to the photon-photon pair production of $γ$-rays with low ionization lines at the outer edge of the broad-line region (BLR), during the high-flux state of the flat-spectrum radio quasar PKS 1424$-$418. The feature can be discriminated from the turnover at higher energies resulting from $γ$-ray absorption in the extragalactic background light. It is absent in the low-flux states, supporting the interpretation that powerful dissipation events within or at the edge of the BLR evolve into fainter $γ$-ray emitting zones outside the BLR, possibly associated with the moving VLBI radio knots. The inferred location of the $γ$-ray emission zone is consistent with the observed variability time scale of the brightest flare, provided that the flare is attributed to external Compton scattering with BLR photons.

Submitted 18 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

Comments: 10 pages, 3 figures, 1 table, Accepted for publication in ApJL

arXiv:2405.09149 [pdf, other]

Exploring uniformity and maximum entropy distribution on torus through intrinsic geometry: Application to protein-chemistry

Authors: Surojit Biswas, Buddhananda Banerjee

Abstract: A generic family of distributions defined on the surface of a curved torus is introduced using its area element. The area uniformity and the maximum entropy distribution are identified using the trigonometric moments of the proposed family. A marginal distribution is obtained as a three-parameter modification of the von Mises distribution that encompasses the von Mises, Cardioid, and Uniform distributions as special cases. The proposed family of the marginal distribution exhibits both symmetric and asymmetric, unimodal or bimodal shapes, contingent upon parameters. Furthermore, we scrutinize a two-parameter symmetric submodel, examining its moments, measure of variation, Kullback-Leibler divergence, and maximum likelihood estimation, among other properties. In addition, we introduce a modified acceptance-rejection sampling with a thin envelope obtained from the upper-Riemann-sum of a circular density, achieving a high rate of acceptance. This proposed sampling scheme will accelerate empirical studies for large-scale simulation by reducing the processing time. Furthermore, we extend the Uniform, Wrapped Cauchy, and Kato-Jones distributions to the surface of the curved torus and implement the proposed bivariate toroidal distribution for different groups of protein data, namely, $α$-helix, $β$-sheet, and their mixture. A marginal of this proposed distribution is fitted to the wind direction data.

Submitted 16 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2304.01599

arXiv:2405.01040 [pdf, other]

Few Shot Class Incremental Learning using Vision-Language models

Authors: Anurag Kumar, Chinmay Bharti, Saikat Dutta, Srikrishna Karanam, Biplab Banerjee

Abstract: Recent advancements in deep learning have demonstrated remarkable performance comparable to human capabilities across various supervised computer vision tasks. However, the prevalent assumption of having an extensive pool of training data encompassing all classes prior to model training often diverges from real-world scenarios, where limited data availability for novel classes is the norm. The challenge emerges in seamlessly integrating new classes with few samples into the training data, demanding the model to adeptly accommodate these additions without compromising its performance on base classes. To address this exigency, the research community has introduced several solutions under the realm of few-shot class incremental learning (FSCIL). In this study, we introduce an innovative FSCIL framework that utilizes a language regularizer and a subspace regularizer. During base training, the language regularizer helps incorporate semantic information extracted from a Vision-Language model. The subspace regularizer helps in facilitating the model's acquisition of nuanced connections between image and text semantics inherent to base classes during incremental training. Our proposed framework not only empowers the model to embrace novel classes with limited data, but also ensures the preservation of performance on base classes. To substantiate the efficacy of our approach, we conduct comprehensive experiments on three distinct FSCIL benchmarks, where our framework attains state-of-the-art performance.

Submitted 15 August, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.16499 [pdf, other]

Host star properties of hot, warm and cold Jupiters in the solar neighborhood from \textit{Gaia} DR3: clues to formation pathways

Authors: Bihan Banerjee, Mayank Narang, P. Manoj, Thomas Henning, Himanshu Tyagi, Arun Surya, Prasanta K. Nayak, Mihir Tripathi

Abstract: Giant planets exhibit diverse orbital properties, hinting at their distinct formation and dynamic histories. In this paper, using $\textit{Gaia}$ DR3, we investigate if and how the orbital properties of Jupiters are linked to their host star properties, particularly their metallicity and age. We obtain metallicities for main sequence stars of spectral type F, G, and K, hosting hot, warm, and cold Jupiters with varying eccentricities. We compute the velocity dispersion of host stars of these three groups using kinematic information from $\textit{Gaia}$ DR3 and obtain average ages using the velocity dispersion-age relation. We find that host stars of hot Jupiters are relatively metal-rich ([Fe/H]=$0.18 \pm 0.13$) and young (median age $3.97 \pm 0.51$ Gyr) compared to the host stars of cold Jupiters in nearly circular orbits, which are relatively metal-poor ([Fe/H]=$0.03 \pm 0.18$) and older (median age $6.07 \pm 0.79$ Gyr). Host stars of cold Jupiters in highly eccentric orbits, on the other hand, show metallicities similar to that of the hosts of hot Jupiters, but are older, on average (median age $6.25 \pm 0.92$ Gyr). The similarity in metallicity between hosts of hot Jupiters and hosts of cold Jupiters in highly eccentric orbits supports high eccentricity migration as the potential origin of hot Jupiters, with the latter serving as the progenitors. However, the average age difference between them suggests that the older hot Jupiters may have been engulfed by the star in a timescale of $\sim 6$ Gyr. This allows us to estimate the value of the stellar tidal quality factor $Q'_\ast\sim10^{6\pm1}$.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.05366 [pdf, other]

CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery

Authors: Sai Bhargav Rongali, Sarthak Mehrotra, Ankit Jha, Mohamad Hassan N C, Shirsha Bose, Tanisha Gupta, Mainak Singha, Biplab Banerjee

Abstract: In Generalized Category Discovery (GCD), we cluster unlabeled samples of known and novel classes, leveraging a training dataset of known classes. A salient challenge arises due to domain shifts between these datasets. To address this, we present a novel setting: Across Domain Generalized Category Discovery (AD-GCD) and bring forth CDAD-NET (Class Discoverer Across Domains) as a remedy. CDAD-NET is architected to synchronize potential known class samples across both the labeled (source) and unlabeled (target) datasets, while emphasizing the distinct categorization of the target data. To facilitate this, we propose an entropy-driven adversarial learning strategy that accounts for the distance distributions of target samples relative to source-domain class prototypes. In parallel, the discriminative nature of the shared space is upheld through a fusion of three metric learning objectives. In the source domain, our focus is on refining the proximity between samples and their affiliated class prototypes, while in the target domain, we integrate a neighborhood-centric contrastive learning mechanism, enriched with an adept neighbors-mining approach. To further accentuate the nuanced feature interrelation among semantically aligned images, we champion the concept of conditional image inpainting, underscoring the premise that semantically analogous images prove more efficacious to the task than their disjointed counterparts. Experimentally, CDAD-NET eclipses existing literature with a performance increment of 8-15% on three AD-GCD benchmarks we present.

Submitted 8 April, 2024; originally announced April 2024.

Comments: Accepted in L3D-IVU, CVPR Workshop, 2024

arXiv:2404.04248 [pdf, other]

doi 10.3847/2041-8213/ad5beb

Observation of Gravitational Waves from the Coalescence of a $2.5\text{-}4.5~M_\odot$ Compact Object and a Neutron Star

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, S. Akçay, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, et al. (1771 additional authors not shown)

Abstract: We report the observation of a coalescing compact binary with component masses $2.5\text{-}4.5~M_\odot$ and $1.2\text{-}2.0~M_\odot$ (all measurements quoted at the 90% credible level). The gravitational-wave signal GW230529_181500 was observed during the fourth observing run of the LIGO-Virgo-KAGRA detector network on 2023 May 29 by the LIGO Livingston Observatory. The primary component of the source has a mass less than $5~M_\odot$ at 99% credibility. We cannot definitively determine from gravitational-wave data alone whether either component of the source is a neutron star or a black hole. However, given existing estimates of the maximum neutron star mass, we find the most probable interpretation of the source to be the coalescence of a neutron star with a black hole that has a mass between the most massive neutron stars and the least massive black holes observed in the Galaxy. We provisionally estimate a merger rate density of $55^{+127}_{-47}~\text{Gpc}^{-3}\,\text{yr}^{-1}$ for compact binary coalescences with properties similar to the source of GW230529_181500; assuming that the source is a neutron star-black hole merger, GW230529_181500-like sources constitute about 60% of the total merger rate inferred for neutron star-black hole coalescences. The discovery of this system implies an increase in the expected rate of neutron star-black hole mergers with electromagnetic counterparts and provides further evidence for compact objects existing within the purported lower mass gap.

Submitted 26 July, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: 45 pages (10 pages author list, 13 pages main text, 1 page acknowledgements, 13 pages appendices, 8 pages bibliography), 17 figures, 16 tables. Update to match version published in The Astrophysical Journal Letters. Data products available from https://zenodo.org/records/10845779

Report number: LIGO-P2300352

Journal ref: ApJL 970, L34 (2024)

arXiv:2404.00710 [pdf, other]

Unknown Prompt, the only Lacuna: Unveiling CLIP's Potential for Open Domain Generalization

Authors: Mainak Singha, Ankit Jha, Shirsha Bose, Ashwin Nair, Moloud Abdar, Biplab Banerjee

Abstract: We delve into Open Domain Generalization (ODG), marked by domain and category shifts between training's labeled source and testing's unlabeled target domains. Existing solutions to ODG face limitations due to constrained generalizations of traditional CNN backbones and errors in detecting target open samples in the absence of prior knowledge. Addressing these pitfalls, we introduce ODG-CLIP, harnessing the semantic prowess of the vision-language model, CLIP. Our framework brings forth three primary innovations: Firstly, distinct from prevailing paradigms, we conceptualize ODG as a multi-class classification challenge encompassing both known and novel categories. Central to our approach is modeling a unique prompt tailored for detecting unknown class samples, and to train this, we employ a readily accessible stable diffusion model, elegantly generating proxy images for the open class. Secondly, aiming for domain-tailored classification (prompt) weights while ensuring a balance of precision and simplicity, we devise a novel visual style-centric prompt learning mechanism. Finally, we infuse images with class-discriminative knowledge derived from the prompt space to augment the fidelity of CLIP's visual embeddings. We introduce a novel objective to safeguard the continuity of this infused semantic intel across domains, especially for the shared classes. Through rigorous testing on diverse datasets, covering closed and open-set DG contexts, ODG-CLIP demonstrates clear supremacy, consistently outpacing peers with performance boosts of 8%-16%. Code will be available at https://github.com/mainaksingha01/ODG-CLIP.

Submitted 31 March, 2024; originally announced April 2024.

Comments: Accepted in CVPR 2024

arXiv:2403.18454 [pdf, other]

Scaling Vision-and-Language Navigation With Offline RL

Authors: Valay Bundele, Mahesh Bhupati, Biplab Banerjee, Aditya Grover

Abstract: The study of vision-and-language navigation (VLN) has typically relied on expert trajectories, which may not always be available in real-world situations due to the significant effort required to collect them. On the other hand, existing approaches to training VLN agents that go beyond available expert data involve data augmentations or online exploration which can be tedious and risky. In contrast, it is easy to access large repositories of suboptimal offline trajectories. Inspired by research in offline reinforcement learning (ORL), we introduce a new problem setup of VLN-ORL which studies VLN using suboptimal demonstration data. We introduce a simple and effective reward-conditioned approach that can account for dataset suboptimality for training VLN agents, as well as benchmarks to evaluate progress and promote research in this area. We empirically study various noise models for characterizing dataset suboptimality among other unique challenges in VLN-ORL and instantiate it for the VLN$\circlearrowright$BERT and MTVM architectures in the R2R and RxR environments. Our experiments demonstrate that the proposed reward-conditioned approach leads to significant performance improvements, even in complex and intricate environments.

Submitted 27 March, 2024; originally announced March 2024.

Comments: Published in Transactions on Machine Learning Research (04/2024)

arXiv:2403.12491 [pdf, ps, other]

A consistent test of spherical symmetry for multivariate and high-dimensional data via data augmentation

Authors: Bilol Banerjee, Anil K. Ghosh

Abstract: We develop a test for spherical symmetry of a multivariate distribution $P$ that works even when the dimension of the data $d$ is larger than the sample size $n$. We propose a non-negative measure $ζ(P)$ such that $ζ(P)=0$ if and only if $P$ is spherically symmetric. We construct a consistent estimator of $ζ(P)$ using the data augmentation method and investigate its large sample properties. The proposed test based on this estimator is calibrated using a novel resampling algorithm. Our test controls the Type-I error, and it is consistent against general alternatives. We also study its behaviour for a sequence of alternatives $(1-δ_n) F+δ_n G$, where $ζ(G)=0$ but $ζ(F)>0$, and $δ_n \in [0,1]$. When $\limsup δ_n<1$, for any $G$, the power of our test converges to unity as $n$ increases. However, if $\limsup δ_n=1$, the asymptotic power of our test depends on $\lim n(1-δ_n)^2$. We establish this by proving the minimax rate optimality of our test over a suitable class of alternatives and showing that it is Pitman efficient when $\lim n(1-δ_n)^2>0$. Moreover, our test is provably consistent for high-dimensional data even when $d$ is larger than $n$. Our numerical results amply demonstrate the superiority of the proposed test over some state-of-the-art methods.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.03004 [pdf, other]

Ultralight vector dark matter search using data from the KAGRA O3GK run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, et al. (1778 additional authors not shown)

Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we present the result of a search for $U(1)_{B-L}$ gauge boson DM using the KAGRA data from auxiliary length channels during the first joint observation run together with GEO600. By applying our search pipeline, which takes into account the stochastic nature of ultralight DM, upper bounds on the coupling strength between the $U(1)_{B-L}$ gauge boson and ordinary matter are obtained for a range of DM masses. While our constraints are less stringent than those derived from previous experiments, this study demonstrates the applicability of our method to the lower-mass vector DM search, which is made difficult in this measurement by the short observation time compared to the auto-correlation time scale of DM.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 20 pages, 5 figures

Report number: LIGO-P2300250

arXiv:2403.00508 [pdf, other]

Changepoint problem with angular data using a measure of variation based on the intrinsic geometry of torus

Authors: Surojit Biswas, Buddhananda Banerjee, Arnab Kumar Laha

Abstract: In many temporally ordered data sets, it is observed that the parameters of the underlying distribution change abruptly at unknown times. The detection of such changepoints is important for many applications. While this problem has been studied substantially in the linear data setup, not much work has been done for angular data. In this article, we utilize the intrinsic geometry of a torus to introduce the notion of the `square of an angle' and use it to propose a new measure of variation, called the `curved variance', of an angular random variable. Using the above ideas, we propose new tests for the existence of changepoint(s) in the concentration, mean direction, and/or both of these. The limiting distributions of the test statistics are derived and their powers are obtained using extensive simulation. It is seen that the tests have better power than the corresponding existing tests. The proposed methods have been implemented on three real-life data sets revealing interesting insights. In particular, our method when used to detect simultaneous changes in mean direction and concentration for hourly wind direction measurements of the cyclonic storm `Amphan' identified changepoints that could be associated with important meteorological events.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.11182 [pdf, other]

A uGMRT search for radio emission from planets around evolved stars

Authors: Mayank Narang, P. Manoj, Ishwara Chandra, Bihan Banerjee, Himanshu Tyagi, Motohide Tamura, Thomas Henning, Blesson Mathew, Joseph Lazio, Arun Surya, Prasanta K. Nayak

Abstract: In this work, we present the results from a study using the Giant Metrewave Radio Telescope (GMRT) to search for radio emission from planets around three evolved stars, namely $α$~Tau, $β$~UMi, and $β$~Gem. Both $α$~Tau and $β$~UMi host massive ($\sim$6 $M_J$) planets at $\sim$1.4 au from the central star, while $β$~Gem hosts a 2.9 $M_J$ planet at 1.7 au from the host star. We observed $α$~Tau and $β$~UMi in two upgraded GMRT (uGMRT) bands: band~3 (250-500~MHz) and band~4 (550-900~MHz). We also analyzed archival GMRT observations of $β$~Gem at 150~MHz. We did not detect any radio signals from these systems. At 400~MHz, the 3$σ$ upper limit is 87 $μ$Jy/beam for $α$~Tau~b and 77.4 $μ$Jy/beam for $β$~UMi~b. From our observations at 650~MHz, we place a 3$σ$ upper limit of 28.2 $μ$Jy/beam for $α$~Tau~b and 33.6 $μ$Jy/beam for $β$~UMi~b. For $β$~Gem~b, at 150~MHz, we place an upper limit of 2.5 mJy. At 400~MHz and 650~MHz, our observations are the deepest radio images for any exoplanetary system.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 9 pages, 6 figures, accepted at MNRAS

arXiv:2402.00295 [pdf]

Comparative Evaluation of Traditional and Deep Learning-Based Segmentation Methods for Spoil Pile Delineation Using UAV Images

Authors: Sureka Thiruchittampalam, Bikram P. Banerjee, Nancy F. Glenn, Simit Raval

Abstract: The stability of mine dumps is contingent upon the precise arrangement of spoil piles, taking into account their geological and geotechnical attributes. Yet, on-site characterisation of individual piles poses a formidable challenge. The utilisation of image-based techniques for spoil pile characterisation, employing remotely acquired data through unmanned aerial systems, is a promising complementary solution. Image-processing tasks such as object-based classification and feature extraction depend on effective segmentation. This study refines and compares various segmentation approaches, specifically colour-based and morphology-based techniques. The objective is to enhance and evaluate avenues for object-based analysis for spoil characterisation within the context of mining environments. Furthermore, a comparative analysis is conducted between conventional segmentation approaches and those rooted in deep learning methodologies. Among the diverse segmentation approaches evaluated, the morphology-based deep learning segmentation approach, Segment Anything Model (SAM), exhibited superior performance in comparison to other approaches. This outcome underscores the efficacy of incorporating advanced morphological and deep learning techniques for accurate and efficient spoil pile characterisation. The findings of this study contribute valuable insights to the optimisation of segmentation strategies, thereby advancing the application of image-based techniques for the characterisation of spoil piles in mining environments.

Submitted 31 January, 2024; originally announced February 2024.

arXiv:2402.00170 [pdf]

An Evaluation of Calibrated and Uncalibrated High-Resolution RGB Data in Time Series Analysis for Coal Spoil Characterisation: A Comparative Study

Authors: Sureka Thiruchittampalam, Bikram Pratap Banerjee, Nancy F Glenn, Simit Raval

Abstract: Minor errors in the spoil deposition process, such as placing stronger materials with higher shear strength over weaker ones, can lead to potential dump failure. Irregular deposition and inadequate compaction complicate coal spoil behaviour, necessitating a robust methodology for temporal monitoring. This study explores using unmanned aerial vehicles (UAV) equipped with red-green-blue (RGB) sensors for efficient data acquisition. Despite their prevalence, raw UAV data exhibit temporal inconsistency, hindering accurate assessments of changes over time. This is attributed to radiometric errors in UAV-based sensing arising from factors such as sensor noise, atmospheric scattering and absorption, variations in sun parameters, and variable characteristics of the sensed object over time. To this end, the study introduces an empirical line calibration with invariant targets, for precise calibration across diverse scenes. Calibrated RGB data exhibit a substantial performance advantage, achieving a 90.7% overall accuracy for spoil pile classification using ensemble (subspace discriminant), representing a noteworthy 7% improvement compared to classifying uncalibrated data. The study highlights the critical role of data calibration in optimising UAV effectiveness for spatio-temporal mine dump monitoring. The developed calibration workflow proves robust and reliable across multiple dates. Consequently, these findings play a crucial role in informing and refining sustainable management practices within the domain of mine waste management.

Submitted 31 January, 2024; originally announced February 2024.

arXiv:2311.15812 [pdf, other]

C-SAW: Self-Supervised Prompt Learning for Image Generalization in Remote Sensing

Authors: Avigyan Bhattacharya, Mainak Singha, Ankit Jha, Biplab Banerjee

Abstract: We focus on domain and class generalization problems in analyzing optical remote sensing images, using the large-scale pre-trained vision-language model (VLM), CLIP. While contrastively trained VLMs show impressive zero-shot generalization performance, their effectiveness is limited when dealing with diverse domains during training and testing. Existing prompt learning techniques overlook the importance of incorporating domain and content information into the prompts, which results in a drop in performance while dealing with such multi-domain data. To address these challenges, we propose a solution that ensures domain-invariant prompt learning while enhancing the expressiveness of visual features. We observe that CLIP's vision encoder struggles to identify contextual image information, particularly when image patches are jumbled up. This issue is especially severe in optical remote sensing images, where land-cover classes exhibit well-defined contextual appearances. To this end, we introduce C-SAW, a method that complements CLIP with a self-supervised loss in the visual space and a novel prompt learning technique that emphasizes both visual domain and content-specific features. We keep the CLIP backbone frozen and introduce a small set of projectors for both the CLIP encoders to train C-SAW contrastively. Experimental results demonstrate the superiority of C-SAW across multiple remote sensing benchmarks and different generalization tasks.

Submitted 27 November, 2023; originally announced November 2023.

Comments: Accepted in ACM ICVGIP 2023

arXiv:2311.11781 [pdf, other]

doi 10.1007/s12036-023-09972-6

Identifying the population of T-Tauri stars in Taurus: UV-optical synergy

Authors: Prasanta K. Nayak, Mayank Narang, Manoj Puravankara, Himanshu Tyagi, Bihan Banerjee, Saurabh Sharma, Rakesh Pandey, Arun Surya, Blesson Mathew, R. Arun, K. Ujjwal, Sreeja S. Kartha

Abstract: With the third data release of the Gaia mission ($Gaia$ DR3) and its precise photometry and astrometry, it is now possible to study the behaviour of stars at a scale never seen before. In this paper, we developed new criteria to identify T-Tauri star (TTS) candidates using UV and optical CMDs by combining the GALEX and Gaia surveys. We found 19 TTS candidates, 5 of which are newly identified TTS in the Taurus Molecular Cloud (TMC), not catalogued before as TMC members. For some of the TTS candidates, we also obtained optical spectra from several Indian telescopes. We also present the analysis of the distance and proper motion of young stars in Taurus using data from $Gaia$ DR3. We found that the stars in Taurus show a bimodal distribution with distance, having peaks at $130.17_{-1.24}^{+1.31}$ pc and $156.25_{-5.00}^{+1.86}$ pc. We attribute this bimodality to different clouds in the TMC region lying at different distances. We further show that the two populations have similar ages and proper motion distributions. Using the $Gaia$ DR3 colour-magnitude diagram, we show that the age of Taurus is consistent with 1 Myr.

Submitted 20 November, 2023; originally announced November 2023.

Comments: 13 pages, 10 figures

Journal ref: Journal of Astrophysics and Astronomy, 2023, Volume 44, Issue 2, article id.83

arXiv:2311.06598 [pdf, other]

Effects of bursty synthesis in organelle biogenesis

Authors: Binayak Banerjee, Dipjyoti Das

Abstract: A fundamental question of cell biology is how cells control the number of organelles. The processes of organelle biogenesis, namely de novo synthesis, fission, fusion, and decay, are inherently stochastic, producing cell-to-cell variability in organelle abundance. In addition, experiments suggest that the synthesis of some organelles can be bursty. We thus ask how bursty synthesis impacts intracellular organelle number distribution. We develop an organelle biogenesis model with bursty de novo synthesis by considering geometrically distributed burst sizes. We analytically solve the model in biologically relevant limits and provide exact expressions for the steady-state organelle number distributions and their means and variances. We also present approximate solutions for the whole model, complementing with exact stochastic simulations. We show that bursts generally increase the noise in organelle numbers, producing distinct signatures in noise profiles depending on different mechanisms of organelle biogenesis. We also find different shapes of organelle number distributions, including bimodal distributions in some parameter regimes. Notably, bursty synthesis broadens the parameter regime of observing bimodality compared to the `non-bursty' case. Together, our framework utilizes number fluctuations to elucidate the role of bursty synthesis in producing organelle number heterogeneity in cells.

Submitted 9 February, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

Comments: 41 pages, 3 main figures, 4 supplemental figures; Accepted in Mathematical Biosciences

arXiv:2311.02599 [pdf, other]

Learning Class and Domain Augmentations for Single-Source Open-Domain Generalization

Authors: Prathmesh Bele, Valay Bundele, Avigyan Bhattacharya, Ankit Jha, Gemma Roig, Biplab Banerjee

Abstract: Single-source open-domain generalization (SS-ODG) addresses the challenge of labeled source domains with supervision during training and unlabeled novel target domains during testing. The target domain includes both known classes from the source domain and samples from previously unseen classes. Existing techniques for SS-ODG primarily focus on calibrating source-domain classifiers to identify open samples in the target domain. However, these methods struggle with visually fine-grained open-closed data, often misclassifying open samples as closed-set classes. Moreover, relying solely on a single source domain restricts the model's ability to generalize. To overcome these limitations, we propose a novel framework called SODG-Net that simultaneously synthesizes novel domains and generates pseudo-open samples using a learning-based objective, in contrast to the ad-hoc mixing strategies commonly found in the literature. Our approach enhances generalization by diversifying the styles of known class samples using a novel metric criterion and generates diverse pseudo-open samples to train a unified and confident multi-class classifier capable of handling both open and closed-set data. Extensive experimental evaluations conducted on multiple benchmarks consistently demonstrate the superior performance of SODG-Net compared to the literature.

Submitted 5 November, 2023; originally announced November 2023.

Comments: 11 pages, WACV 2024

arXiv:2310.00828 [pdf, ps, other]

A Model for Calculating Cost of Applying Electronic Governance and Robotic Process Automation to a Distributed Management System

Authors: Bonny Banerjee, Saurabh Pahune

Abstract: Electronic Governance (eGov) and Robotic Process Automation (RPA) are two technological advancements that have the potential to revolutionize the way organizations manage their operations. When applied to Distributed Management (DM), these technologies can further enhance organizational efficiency and effectiveness. In this brief article, we present a mathematical model for calculating the cost of accomplishing a task by applying eGov and RPA in a DM system. This model is one of the first of its kind, and is expected to spark further research on cost analysis for organizational efficiency given the unprecedented advancements in electronic and automation technologies.

Submitted 1 October, 2023; originally announced October 2023.

arXiv:2309.13470 [pdf, other]

HAVE-Net: Hallucinated Audio-Visual Embeddings for Few-Shot Classification with Unimodal Cues

Authors: Ankit Jha, Debabrata Pal, Mainak Singha, Naman Agarwal, Biplab Banerjee

Abstract: Recognition of remote sensing (RS) or aerial images is currently of great interest, and advancements in deep learning algorithms have accelerated it in recent years. Problems such as occlusion, intra-class variance, and lighting variation might arise while training neural networks using unimodal RS visual input. Even though joint training of audio-visual modalities improves classification performance in a low-data regime, it has yet to be thoroughly investigated in the RS domain. Here, we aim to solve a novel problem where both the audio and visual modalities are present during the meta-training of a few-shot learning (FSL) classifier; however, one of the modalities might be missing during the meta-testing stage. This problem formulation is pertinent in the RS domain, given the difficulties in data acquisition or sensor malfunctioning. To mitigate this, we propose a novel few-shot generative framework, Hallucinated Audio-Visual Embeddings-Network (HAVE-Net), to meta-train cross-modal features from limited unimodal data. Specifically, these hallucinated features are meta-learned from base classes and used for few-shot classification on novel classes during the inference phase. The experimental results on the benchmark ADVANCE and AudioSetZSL datasets show that our hallucinated modality augmentation strategy for few-shot classification outperforms a classifier trained with the real multimodal information by at least 0.8-2%.

Submitted 23 September, 2023; originally announced September 2023.

Comments: 8 pages, 2 figures, 2 tables. Accepted in Adapting to Change: Reliable Multimodal Learning Across Domains Workshop, ECML PKDD 2023

arXiv:2309.12814 [pdf, other]

Domain Adaptive Few-Shot Open-Set Learning

Authors: Debabrata Pal, Deeptej More, Sai Bhargav, Dipesh Tamboli, Vaneet Aggarwal, Biplab Banerjee

Abstract: Few-shot learning has made impressive strides in addressing the crucial challenges of recognizing unknown samples from novel classes in target query sets and managing visual shifts between domains. However, existing techniques fall short when it comes to identifying target outliers under domain shifts by learning to reject pseudo-outliers from the source domain, resulting in an incomplete solution to both problems. To address these challenges comprehensively, we propose a novel approach called Domain Adaptive Few-Shot Open Set Recognition (DA-FSOS) and introduce a meta-learning-based architecture named DAFOS-NET. During training, our model learns a shared and discriminative embedding space while creating a pseudo open-space decision boundary, given a fully-supervised source domain and a label-disjoint few-shot target domain. To enhance data density, we use a pair of conditional adversarial networks with tunable noise variances to augment both domains' closed and pseudo-open spaces. Furthermore, we propose a domain-specific batch-normalized class prototypes alignment strategy to align both domains globally while ensuring class-discriminativeness through novel metric objectives. Our training approach ensures that DAFOS-NET can generalize well to new scenarios in the target domain. We present three benchmarks for DA-FSOS based on the Office-Home, mini-ImageNet/CUB, and DomainNet datasets and demonstrate the efficacy of DAFOS-NET through extensive experimentation.

Submitted 22 September, 2023; originally announced September 2023.

Journal ref: ICCV 2023

arXiv:2309.01050 [pdf, other]

Efficient Curriculum based Continual Learning with Informative Subset Selection for Remote Sensing Scene Classification

Authors: S Divakar Bhat, Biplab Banerjee, Subhasis Chaudhuri, Avik Bhattacharya

Abstract: We tackle the problem of class incremental learning (CIL) in the realm of landcover classification from optical remote sensing (RS) images in this paper. The paradigm of CIL has recently gained much prominence given the fact that data are generally obtained in a sequential manner for real-world phenomena. However, CIL has not yet been extensively considered in the domain of RS, even though satellites tend to discover new classes at different geographical locations over time. With this motivation, we propose a novel CIL framework inspired by the recent success of replay-memory based approaches and tackling two of their shortcomings. In order to reduce the effect of catastrophic forgetting of the old classes when a new stream arrives, we learn a curriculum of the new classes based on their similarity with the old classes. This is found to limit the degree of forgetting substantially. Next, while constructing the replay memory, instead of randomly selecting samples from the old streams, we propose a sample selection strategy which ensures the selection of highly confident samples so as to reduce the effects of noise. We observe a sharp improvement in the CIL performance with the proposed components. Experimental results on the benchmark NWPU-RESISC45, PatternNet, and EuroSAT datasets confirm that our method offers a better stability-plasticity trade-off than the literature.

Submitted 2 September, 2023; originally announced September 2023.

arXiv:2308.13666 [pdf, other]

A Joint Fermi-GBM and Swift-BAT Analysis of Gravitational-Wave Candidates from the Third Gravitational-wave Observing Run

Authors: C. Fletcher, J. Wood, R. Hamburg, P. Veres, C. M. Hui, E. Bissaldi, M. S. Briggs, E. Burns, W. H. Cleveland, M. M. Giles, A. Goldstein, B. A. Hristov, D. Kocevski, S. Lesage, B. Mailyan, C. Malacaria, S. Poolakkil, A. von Kienlin, C. A. Wilson-Hodge, The Fermi Gamma-ray Burst Monitor Team, M. Crnogorčević, J. DeLaunay, A. Tohuvavohu, R. Caputo, S. B. Cenko, et al. (1674 additional authors not shown)

Abstract: We present Fermi Gamma-ray Burst Monitor (Fermi-GBM) and Swift Burst Alert Telescope (Swift-BAT) searches for gamma-ray/X-ray counterparts to gravitational wave (GW) candidate events identified during the third observing run of the Advanced LIGO and Advanced Virgo detectors. Using Fermi-GBM on-board triggers and sub-threshold gamma-ray burst (GRB) candidates found in the Fermi-GBM ground analyses, the Targeted Search and the Untargeted Search, we investigate whether there are any coincident GRBs associated with the GWs. We also search the Swift-BAT rate data around the GW times to determine whether a GRB counterpart is present. No counterparts are found. Using both the Fermi-GBM Targeted Search and the Swift-BAT search, we calculate flux upper limits and present joint upper limits on the gamma-ray luminosity of each GW. Given these limits, we constrain theoretical models for the emission of gamma-rays from binary black hole mergers.

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.12689 [pdf, other]

Optical spectroscopy of Gaia detected protostars with DOT: can we probe protostellar photospheres?

Authors: Mayank Narang, Manoj Puravankara, Himanshu Tyagi, Prasanta K. Nayak, Saurabh Sharma, Arun Surya, Bihan Banerjee, Blesson Mathew, Arpan Ghosh, Aayushi Verma

Abstract: Optical spectroscopy offers the most direct view of the stellar properties and the accretion indicators. Standard accretion tracers, such as $Hβ$, $Hα$, and the Ca II triplet lines, and most photospheric features, fall in the optical wavelengths. However, these tracers are not readily observable from deeply embedded protostars because of the large line-of-sight extinction ($A_V \sim$ 50-100 mag) toward them. In some cases, however, it is possible to observe protostars at optical wavelengths if the outflow cavity is aligned along the line of sight, allowing observations of the photosphere, or if the envelope is very tenuous and thin such that the extinction is low. In such cases, we can not only detect these protostars at optical wavelengths but also follow up spectroscopically. We have used the HOPS catalog (Furlan et al. 2016) of protostars in Orion to search for optical counterparts for protostars in the Gaia DR3 survey. Out of the 330 protostars in the HOPS sample, an optical counterpart within 2" is detected for 62 of the protostars. For 17 out of 62 optically detected protostars, we obtained optical spectra (between 5500 and 8900 Å) using the Aries-Devasthal Faint Object Spectrograph & Camera (ADFOSC) on the 3.6-m Devasthal Optical Telescope (DOT) and the Hanle Faint Object Spectrograph Camera (HFOSC) on the 2-m Himalayan Chandra Telescope (HCT). We detect strong photospheric features, such as the TiO bands, in the spectra of 4 protostars, hinting that photospheres can form early on in the star formation process. We further determined the spectral types of protostars, which show photospheres similar to a late M-type. Mass accretion rates derived for the protostars are similar to those found for T-Tauri stars, in the range of 10$^{-7}$ to 10$^{-8}$ $M_\odot$/yr.

Submitted 24 August, 2023; originally announced August 2023.

Comments: 9 pages, 5 figures accepted in Journal of Astrophysics and Astronomy as part of the "Star formation studies in the context of NIR instruments on 3.6m DOT" special issue

arXiv:2308.11605 [pdf, other]

GOPro: Generate and Optimize Prompts in CLIP using Self-Supervised Learning

Authors: Mainak Singha, Ankit Jha, Biplab Banerjee

Abstract: Large-scale foundation models, such as CLIP, have demonstrated remarkable success in visual recognition tasks by embedding images in a semantically rich space. Self-supervised learning (SSL) has also shown promise in improving visual recognition by learning invariant features. However, the combination of CLIP with SSL is found to face challenges due to the multi-task framework that blends CLIP's contrastive loss and SSL's loss, including difficulties with loss weighting and inconsistency among different views of images in CLIP's output space. To overcome these challenges, we propose a prompt learning-based model called GOPro, which is a unified framework that ensures similarity between various augmented views of input images in a shared image-text embedding space, using a pair of learnable image and text projectors atop CLIP, to promote invariance and generalizability. To automatically learn such prompts, we leverage the visual content and style primitives extracted from pre-trained CLIP and adapt them to the target task. In addition to CLIP's cross-domain contrastive loss, we introduce a visual contrastive loss and a novel prompt consistency loss, considering the different views of the images. GOPro is trained end-to-end on all three loss objectives, combining the strengths of CLIP and SSL in a principled manner. Empirical evaluations demonstrate that GOPro outperforms the state-of-the-art prompting techniques on three challenging domain generalization tasks across multiple benchmarks by a significant margin. Our code is available at https://github.com/mainaksingha01/GOPro.

Submitted 22 August, 2023; originally announced August 2023.

Comments: Accepted at BMVC 2023

arXiv:2308.05659 [pdf, other]

AD-CLIP: Adapting Domains in Prompt Space Using CLIP

Authors: Mainak Singha, Harsh Pal, Ankit Jha, Biplab Banerjee

Abstract: Although deep learning models have shown impressive performance on supervised learning tasks, they often struggle to generalize well when the training (source) and test (target) domains differ. Unsupervised domain adaptation (DA) has emerged as a popular solution to this problem. However, current DA techniques rely on visual backbones, which may lack semantic richness. Despite the potential of large-scale vision-language foundation models like CLIP, their effectiveness for DA has yet to be fully explored. To address this gap, we introduce AD-CLIP, a domain-agnostic prompt learning strategy for CLIP that aims to solve the DA problem in the prompt space. We leverage the frozen vision backbone of CLIP to extract both image style (domain) and content information, which we apply to learn prompt tokens. Our prompts are designed to be domain-invariant and class-generalizable, by conditioning prompt learning on image style and content features simultaneously. We use standard supervised contrastive learning in the source domain, while proposing an entropy minimization strategy to align domains in the embedding space given the target domain data. We also consider a scenario where only target domain samples are available during testing, without any source domain data, and propose a cross-domain style mapping network to hallucinate domain-agnostic tokens. Our extensive experiments on three benchmark DA datasets demonstrate the effectiveness of AD-CLIP compared to existing literature.

Submitted 10 August, 2023; originally announced August 2023.

Comments: 10 pages, 8 figures, 4 tables. Accepted at OOD-CV, ICCV Workshop, 2023

arXiv:2308.04589 [pdf, other]

Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction

Authors: Izzeddin Teeti, Rongali Sai Bhargav, Vivek Singh, Andrew Bradley, Biplab Banerjee, Fabio Cuzzolin

Abstract: The emerging field of action prediction plays a vital role in various computer vision applications such as autonomous driving, activity analysis and human-computer interaction. Despite significant advancements, accurately predicting future actions remains a challenging problem due to high dimensionality, complex dynamics and uncertainties inherent in video data. Traditional supervised approaches require large amounts of labelled data, which is expensive and time-consuming to obtain. This paper introduces a novel self-supervised video strategy for enhancing action prediction inspired by DINO (self-distillation with no labels). The Temporal-DINO approach employs two models: a 'student' processing past frames, and a 'teacher' processing both past and future frames, enabling a broader temporal context. During training, the teacher guides the student to learn future context by only observing past frames. The strategy is evaluated on the ROAD dataset for the action prediction downstream task using 3D-ResNet, Transformer, and LSTM architectures. The experimental results showcase significant improvements in prediction performance across these architectures, with our method achieving an average enhancement of 9.9% Precision Points (PP), highlighting its effectiveness in enhancing the backbones' capabilities of capturing long-term dependencies. Furthermore, our approach demonstrates efficiency regarding the pretraining dataset size and the number of epochs required. This method overcomes limitations present in other approaches, including considering various backbone architectures, addressing multiple prediction horizons, reducing reliance on hand-crafted augmentations, and streamlining the pretraining process into a single stage. These findings highlight the potential of our approach in diverse video-based tasks such as activity recognition, motion planning, and scene understanding.

Submitted 20 August, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

arXiv:2308.03822 [pdf, other]

Search for Eccentric Black Hole Coalescences during the Third Observing Run of LIGO and Virgo

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, et al. (1750 additional authors not shown)

Abstract: Despite the growing number of confident binary black hole coalescences observed through gravitational waves so far, the astrophysical origin of these binaries remains uncertain. Orbital eccentricity is one of the clearest tracers of binary formation channels. Identifying binary eccentricity, however, remains challenging due to the limited availability of gravitational waveforms that include effects of eccentricity. Here, we present observational results for a waveform-independent search sensitive to eccentric black hole coalescences, covering the third observing run (O3) of the LIGO and Virgo detectors. We identified no new high-significance candidates beyond those that were already identified with searches focusing on quasi-circular binaries. We determine the sensitivity of our search to high-mass (total mass $M>70$ $M_\odot$) binaries covering eccentricities up to 0.3 at 15 Hz orbital frequency, and use this to compare model predictions to search results. Assuming all detections are indeed quasi-circular, for our fiducial population model, we place an upper limit for the merger rate density of high-mass binaries with eccentricities $0 < e \leq 0.3$ at $0.33$ Gpc$^{-3}$ yr$^{-1}$ at 90% confidence level.

Submitted 7 August, 2023; originally announced August 2023.

Comments: 24 pages, 5 figures

Report number: LIGO-P2300080

arXiv:2307.14570 [pdf, other]

Physically Plausible 3D Human-Scene Reconstruction from Monocular RGB Image using an Adversarial Learning Approach

Authors: Sandika Biswas, Kejie Li, Biplab Banerjee, Subhasis Chaudhuri, Hamid Rezatofighi

Abstract: Holistic 3D human-scene reconstruction is a crucial and emerging research area in robot perception. A key challenge in holistic 3D human-scene reconstruction is to generate a physically plausible 3D scene from a single monocular RGB image. The existing research mainly proposes optimization-based approaches for reconstructing the scene from a sequence of RGB frames with explicitly defined physical laws and constraints between different scene elements (humans and objects). However, it is hard to explicitly define and model every physical law in every scenario. This paper proposes using an implicit feature representation of the scene elements to distinguish a physically plausible alignment of humans and objects from an implausible one. We propose using a graph-based holistic representation with an encoded physical representation of the scene to analyze the human-object and object-object interactions within the scene. Using this graphical representation, we adversarially train our model to learn the feasible alignments of the scene elements from the training data itself without explicitly defining the laws and constraints between them. Unlike the existing inference-time optimization-based approaches, we use this adversarially trained model to produce a per-frame 3D reconstruction of the scene that abides by the physical laws and constraints. Our learning-based method achieves comparable 3D reconstruction quality to existing optimization-based holistic human-scene reconstruction methods and does not need inference-time optimization. This makes it better suited than existing methods for potential use in robotic applications such as robot navigation.

Submitted 26 July, 2023; originally announced July 2023.

Comments: Accepted in RAL 2023

arXiv:2307.11442 [pdf, other]

doi 10.3847/1538-3881/ace782

Age distribution of exoplanet host stars: Chemical and Kinematics age proxies from GAIA DR3

Authors: C. Swastik, Ravinder K. Banyal, Mayank Narang, Athira Unni, Bihan Banerjee, P. Manoj, T. Sivarani

Abstract: The GAIA space mission is impacting astronomy in many significant ways by providing a uniform, homogeneous and precise data set for over 1 billion stars and other celestial objects in the Milky Way and beyond. Exoplanet science has greatly benefited from the unprecedented accuracy of stellar parameters obtained from GAIA. In this study, we combine photometric, astrometric, and spectroscopic data from the most recent Gaia DR3 to examine the kinematic and chemical age proxies for a large sample of 2611 exoplanet-hosting stars whose parameters have been determined uniformly. Using spectroscopic data from the Radial Velocity Spectrometer (RVS) onboard GAIA, we show that stars hosting massive planets are metal-rich and $α$-poor in comparison to stars hosting small planets. The kinematic analysis of the sample reveals that the stellar systems with small planets and those with giant planets differ in key aspects of galactic space velocity and orbital parameters, which are indicative of age. We find that the galactic orbital parameters have a statistically significant difference of 0.06 kpc for $Z_{max}$ and 0.03 for eccentricity, respectively. Furthermore, we estimated the stellar ages of the sample using the MIST-MESA isochrone models. The ages and their proxies for the planet-hosting stars indicate that the hosts of giant planetary systems are younger compared to the population of stars harboring small planets. These age trends are also consistent with the chemical evolution of the galaxy and the formation of giant planets from the core-accretion process.

Submitted 21 July, 2023; originally announced July 2023.

Comments: Accepted for Publication in The Astronomical Journal

arXiv:2306.14264 [pdf, other]

Visual Question Answering in Remote Sensing with Cross-Attention and Multimodal Information Bottleneck

Authors: Jayesh Songara, Shivam Pande, Shabnam Choudhury, Biplab Banerjee, Rajbabu Velmurugan

Abstract: In this research, we deal with the problem of visual question answering (VQA) in remote sensing. While remotely sensed images contain information significant for the task of identification and object detection, they pose a great challenge in their processing because of high dimensionality, volume and redundancy. Furthermore, processing image information jointly with language features adds further constraints, such as mapping the corresponding image and language features. To handle this problem, we propose a cross-attention based approach combined with information maximization. The CNN-LSTM based cross-attention highlights the information in the image and language modalities and establishes a connection between the two, while information maximization learns a low-dimensional bottleneck layer that has all the relevant information required to carry out the VQA task. We evaluate our method on two VQA remote sensing datasets of different resolutions. For the high-resolution dataset, we achieve an overall accuracy of 79.11% and 73.87% for the two test sets, while for the low-resolution dataset, we achieve an overall accuracy of 85.98%.

Submitted 25 June, 2023; originally announced June 2023.

arXiv:2306.11588 [pdf, other]

doi 10.1088/1475-7516/2023/11/007

Cosmological coupling of nonsingular black holes

Authors: M. Cadoni, A. P. Sanna, M. Pitzalis, B. Banerjee, R. Murgia, N. Hazra, M. Branchesi

Abstract: We show that -- in the framework of general relativity (GR) -- if black holes (BHs) are singularity-free objects, they couple to the large-scale cosmological dynamics. We find that the leading contribution to the resulting growth of the BH mass ($M_{\rm BH}$) as a function of the scale factor $a$ stems from the curvature term, yielding $M_{\rm BH} \propto a^k$, with $k=1$. We demonstrate that such a linear scaling is universal for spherically-symmetric objects, and it is the only contribution in the case of regular BHs. For nonsingular horizonless compact objects we instead obtain an additional subleading model-dependent term. We conclude that GR nonsingular BHs/horizonless compact objects, although cosmologically coupled, are unlikely to be the source of dark energy. We test our prediction with astrophysical data by analysing the redshift dependence of the mass growth of supermassive BHs in a sample of elliptical galaxies at redshift $z=0.8 -0.9$. We also compare our theoretical prediction with higher redshift BH mass measurements obtained with the James Webb Space Telescope (JWST). We find that, while $k=1$ is compatible within $1 σ$ with JWST results, the data from elliptical galaxies at $z=0.8 -0.9$ favour values of $k>1$. New samples of BHs covering larger mass and redshift ranges and more precise BH mass measurements are required to settle the issue.

Submitted 1 December, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: 12 pages, 2 figures, discussions/references added, matches the version published in JCAP

Journal ref: JCAP 11 (2023), 007

arXiv:2306.10955 [pdf, other]

Semi-Supervised Learning for hyperspectral images by non parametrically predicting view assignment

Authors: Shivam Pande, Nassim Ait Ali Braham, Yi Wang, Conrad M Albrecht, Biplab Banerjee, Xiao Xiang Zhu

Abstract: Hyperspectral image (HSI) classification is currently gaining a lot of momentum because of the high inherent spectral information within the images. However, these images suffer from the curse of dimensionality and usually require a large number of samples for tasks such as classification, especially in the supervised setting. Recently, to effectively train deep learning models with minimal labelled samples, unlabeled samples are also being leveraged in self-supervised and semi-supervised settings. In this work, we leverage the idea of semi-supervised learning to assist the discriminative self-supervised pretraining of the models. The proposed method takes different augmented views of the unlabeled samples as input and assigns them the same pseudo-label corresponding to the labelled sample from the downstream task. We train our model on two HSI datasets, namely the Houston dataset (from the 2013 data fusion contest) and the Pavia University dataset, and show that the proposed approach performs better than the self-supervised approach and supervised training.

Submitted 19 June, 2023; originally announced June 2023.

Comments: The paper was submitted in IGARSS, 2023 conference and is not accepted to appear in the proceedings. The page requirement is 4 pages, including references

arXiv:2306.06717 [pdf, other]

PWR-Align: Leveraging Part-Whole Relationships for Part-wise Rigid Point Cloud Registration in Mixed Reality Applications

Authors: Manorama Jha, Bhaskar Banerjee

Abstract: We present an efficient and robust point cloud registration (PCR) workflow for part-wise rigid point cloud alignment using the Microsoft HoloLens 2. PCR is an important problem in Augmented and Mixed Reality use cases, and we present a study for a special class of non-rigid transformations. Many commonly encountered objects, such as robots with manipulators and machines with hinges, are composed of rigid parts that move relative to one another about joints, resulting in non-rigid deformation of the whole object. The workflow presented allows us to register the point cloud with various configurations of the point cloud.

Submitted 11 June, 2023; originally announced June 2023.

Comments: Accepted for presentation at WiCV @ CVPR 2023

arXiv:2305.17520 [pdf, other]

USIM-DAL: Uncertainty-aware Statistical Image Modeling-based Dense Active Learning for Super-resolution

Authors: Vikrant Rangnekar, Uddeshya Upadhyay, Zeynep Akata, Biplab Banerjee

Abstract: Dense regression is a widely used approach in computer vision for tasks such as image super-resolution, enhancement, depth estimation, etc. However, the high cost of annotation and labeling makes it challenging to achieve accurate results. We propose incorporating active learning into dense regression models to address this problem. Active learning allows models to select the most informative samples for labeling, reducing the overall annotation cost while improving performance. Despite its potential, active learning has not been widely explored in high-dimensional computer vision regression tasks like super-resolution. We address this research gap and propose a new framework called USIM-DAL that leverages the statistical properties of colour images to learn informative priors using probabilistic deep neural networks that model the heteroscedastic predictive distribution allowing uncertainty quantification. Moreover, the aleatoric uncertainty from the network serves as a proxy for error that is used for active learning. Our experiments on a wide variety of datasets spanning applications in natural images (visual genome, BSD100), medical imaging (histopathology slides), and remote sensing (satellite images) demonstrate the efficacy of the newly proposed USIM-DAL and superiority over several dense regression active learning methods.

Submitted 27 May, 2023; originally announced May 2023.

Comments: Accepted at UAI 2023

arXiv:2305.05159 [pdf, other]

Latent Interactive A2C for Improved RL in Open Many-Agent Systems

Authors: Keyang He, Prashant Doshi, Bikramjit Banerjee

Abstract: There is a prevalence of multiagent reinforcement learning (MARL) methods that engage in centralized training. However, these methods involve obtaining various types of information from the other agents, which may not be feasible in competitive or adversarial settings. A recent method, the interactive advantage actor critic (IA2C), engages in decentralized training coupled with decentralized execution, aiming to predict the other agents' actions from possibly noisy observations. In this paper, we present the latent IA2C that utilizes an encoder-decoder architecture to learn a latent representation of the hidden state and other agents' actions. Our experiments in two domains -- each populated by many agents -- reveal that the latent IA2C significantly improves sample efficiency by reducing variance and converging faster. Additionally, we introduce open versions of these domains where the agent population may change over time, and evaluate on these instances as well.

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2304.08393 [pdf, other]

Search for gravitational-lensing signatures in the full third observing run of the LIGO-Virgo network

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, C. Alléné, A. Allocca, P. A. Altin, et al. (1670 additional authors not shown)

Abstract: Gravitational lensing by massive objects along the line of sight to the source causes distortions of gravitational-wave signals; such distortions may reveal information about fundamental physics, cosmology and astrophysics. In this work, we have extended the search for lensing signatures to all binary black hole events from the third observing run of the LIGO--Virgo network. We search for repeated signals from strong lensing by 1) performing targeted searches for subthreshold signals, 2) calculating the degree of overlap amongst the intrinsic parameters and sky location of pairs of signals, 3) comparing the similarities of the spectrograms amongst pairs of signals, and 4) performing dual-signal Bayesian analysis that takes into account selection effects and astrophysical knowledge. We also search for distortions to the gravitational waveform caused by 1) frequency-independent phase shifts in strongly lensed images, and 2) frequency-dependent modulation of the amplitude and phase due to point masses. None of these searches yields significant evidence for lensing. Finally, we use the non-detection of gravitational-wave lensing to constrain the lensing rate based on the latest merger-rate estimates and the fraction of dark matter composed of compact objects.

Submitted 17 April, 2023; originally announced April 2023.

Comments: 28 pages, 11 figures

Report number: LIGO-P2200031

arXiv:2304.05995 [pdf, other]

APPLeNet: Visual Attention Parameterized Prompt Learning for Few-Shot Remote Sensing Image Generalization using CLIP

Authors: Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee

Abstract: In recent years, the success of large-scale vision-language models (VLMs) such as CLIP has led to their increased usage in various computer vision tasks. These models enable zero-shot inference through carefully crafted instructional text prompts without task-specific supervision. However, the potential of VLMs for generalization tasks in remote sensing (RS) has not been fully realized. To address this research gap, we propose a novel image-conditioned prompt learning strategy called the Visual Attention Parameterized Prompts Learning Network (APPLeNet). APPLeNet emphasizes the importance of multi-scale feature learning in RS scene classification and disentangles visual style and content primitives for domain generalization tasks. To achieve this, APPLeNet combines visual content features obtained from different layers of the vision encoder and style properties obtained from feature statistics of domain-specific batches. An attention-driven injection module is further introduced to generate visual tokens from this information. We also introduce an anti-correlation regularizer to ensure discrimination among the token embeddings, as this visual information is combined with the textual tokens. To validate APPLeNet, we curated four available RS benchmarks and introduced experimental protocols and datasets for three domain generalization tasks. Our results consistently outperform the relevant literature and code is available at https://github.com/mainaksingha01/APPLeNet

Submitted 12 April, 2023; originally announced April 2023.

Comments: 11 Pages, 6 figures, 8 tables, Accepted in Earth Vision (CVPR 2023)

arXiv:2304.01599 [pdf, other]

Sampling from the surface of a curved torus: A new genesis

Authors: Buddhananda Banerjee, Surojit Biswas

Abstract: The distributions of toroidal data, often viewed as an extension of circular distributions, do not consider the intrinsic geometry of a curved torus. For the first time, Diaconis et al. (2013)[Diaconis, P., Holmes, S., & Shahshahani, M. (2013). Sampling from a manifold. Advances in modern statistical theory and applications: a Festschrift in honor of Morris L. Eaton, 10, 102-125.] introduce a uniform distribution on the surface of a curved torus with respect to its surface area. But the suggested acceptance-rejection method of sampling from it rejects approximately half of the data. We propose a probabilistic transformation for sampling from the same distribution without losing data. In addition, we introduce a new genesis of random samples from some popular circular distributions using histogram-based acceptance-rejection sampling that uses a very thin envelope. The idea leads to generalizing for sampling from distributions on the surface of a curved torus with a high acceptance rate. Apart from reducing computational cost in the inferential study of different toroidal distributions, uniform sampling from the surface of a curved torus will be helpful to understand any unknown distribution on it.

Submitted 4 April, 2023; originally announced April 2023.

MSC Class: 53B12; 62D05; 60K35; 82-10

arXiv:2303.17269 [pdf, other]

doi 10.1093/mnras/stad1027

uGMRT observations of the hot-Saturn WASP 69b: Radio-Loud Exoplanet-Exomoon Survey II (RLEES II)

Authors: Mayank Narang, Apurva V. Oza, Kaustubh Hakim, P. Manoj, Himanshu Tyagi, Bihan Banerjee, Arun Surya, Prasanta K. Nayak, Ravinder K. Banyal, Daniel P. Thorngren

Abstract: Exomoons have so far eluded ongoing searches. Several studies have exploited transit and transit timing variations and high-resolution spectroscopy to identify potential exomoon candidates. One method of detecting and confirming these exomoons is to search for signals of planet-moon interactions. In this work, we present the first radio observations of the exomoon candidate system WASP 69b. Based on the detection of alkali metals in the transmission spectra of WASP-69b, it was deduced that the system might be hosting an exomoon. WASP 69b is also one of the exoplanet systems that will be observed as part of JWST cycle-1 GTO. This makes the system an excellent target to observe and follow up. We observed the system for 32 hrs at 150 MHz and 218 MHz using the upgraded Giant Metrewave Radio Telescope (uGMRT). Though we do not detect radio emission from the system, we place strong $3σ$ upper limits of 3.3 mJy at 150 MHz and 0.9 mJy at 218 MHz. We then use these upper limits to estimate the maximum mass loss from the exomoon candidate.

Submitted 30 March, 2023; originally announced March 2023.

Comments: Accepted in MNRAS, 8 pages, 4 Figures

arXiv:2303.16223 [pdf, other]

A bright megaelectronvolt emission line in $γ$-ray burst GRB 221009A

Authors: Maria Edvige Ravasio, Om Sharan Salafia, Gor Oganesyan, Alessio Mei, Giancarlo Ghirlanda, Stefano Ascenzi, Biswajit Banerjee, Samanta Macera, Marica Branchesi, Peter G. Jonker, Andrew J. Levan, Daniele B. Malesani, Katharine B. Mulrey, Andrea Giuliani, Annalisa Celotti, Gabriele Ghisellini

Abstract: The highly variable and energetic pulsed emission of a long gamma-ray burst (GRB) is thought to originate from local, rapid dissipation of kinetic or magnetic energy within an ultra-relativistic jet launched by a newborn compact object, formed during the collapse of a massive star. The spectra of GRB pulses are best modelled by power-law segments, indicating the dominance of non-thermal radiation processes. Spectral lines in the X-ray and soft $γ$-ray regime for the afterglow have been searched for intensively, but never confirmed. No line features have ever been identified in the high energy prompt emission. Here we report the discovery of a highly significant ($> 6 σ$) narrow emission feature at around $10$ MeV in the brightest ever GRB 221009A. By modelling its profile with a Gaussian, we find a roughly constant width $σ\sim 1$ MeV and temporal evolution both in energy ($\sim 12$ MeV to $\sim 6$ MeV) and luminosity ($\sim 10^{50}$ erg/s to $\sim 2 \times 10^{49}$ erg/s) over 80 seconds. We interpret this feature as a blue-shifted annihilation line of relatively cold ($k_\mathrm{B}T\ll m_\mathrm{e}c^2$) electron-positron pairs, which could have formed within the jet region where the brightest pulses of the GRB were produced. A detailed understanding of the conditions that can give rise to such a feature could shed light on the so far poorly understood GRB jet properties and energy dissipation mechanism.

Submitted 28 March, 2023; originally announced March 2023.

Comments: Submitted

Showing 1–50 of 203 results for author: Banerjee, B