






    CAAI-PR Technical Committee Annual Meeting





Chair | Visual Intelligence Editorial Board Meeting | Chair
12:30-13:20 | Keynote: Understanding Egocentric Visual Attention and Actions
    Speaker: Yoichi Sato
Chair | PRCV 2025 Preparatory Meeting | Chair
13:20-14:00 | PRCV 2026 Bid Session | Chair

16:00-17:15 | Oral: Recognition 1 | Oral: Recognition 2 | Oral: Segmentation | Oral: Detection
17:15-18:25 | Poster 2
11:22-12:22 | Oral: Best Paper and Best Student Paper Candidates 2 | Chair | 11:30-12:40 Poster 3
16:00-17:15 | Oral: Low-level Vision | 任健康 | Oral: Applications | Oral: Learning | Oral: 3D Vision
17:15-18:25 | Poster 4

1. Efficient Fine-tuning Strategies for CLIP in Few-shot Scenarios via Supervised Contrastive Learning
2. Two Semantic Information Extension Enhancement Methods For Zero-Shot Learning.
3. Learning Fine-grained and Semantically Aware Mamba Representations for Tampered Text Detection in Images
4. Multi-modality Correlation Learning Network for Pediatric Ventricular Septal Defects Identification
5. A Novel Method for Autism Identification based on Multi-Atlas Features Fusion and Graph Neural Network
6. From Point to Surface: Realistic and Perceptually-Plausible Hazy Image Generation with Glow-Diffusion
7. Reducing Memory Footprint in Deep Network Training by Gradient Space Reutilization
8. Dynamic Subframe Splitting and Spatio-Temporal Motion Entangled Sparse Attention for RGB-E Tracking
Oral: Recognition 1
1. ADAL-GCN: Action Description Aided Learning Graph Convolution Network for Early Action Prediction
2. Skeleton-Language Pre-training to Collaborate with Self-Supervised Human Action Recognition
3. Privacy-preserving Action Recognition: A Survey
4. Multi-scale Spatial and Temporal Feature Aggregation Graph Convolutional Network for Skeleton-Based Action Recognition
5. Foreground-Background Partitioning and Feature Fusion for Weakly Supervised Fine-grained Image Recognition
Oral: Recognition 2
1. H2LMER: A Cross Frame-Rate Representation Alignment Framework for Micro-Expression Recognition
2. 3DFaceMAE: Pre-training of Masked Autoencoder using Patch-based Random Masking Reconstruction and Super-resolution for 3D Face Recognition
3. MFH: Marrying Frequency Domain with Handwritten Mathematical Expression Recognition
4. Exploring Out-of-distribution Scene Text Recognition for Driving Scenes with Hybrid Test-time Adaptation
5. Quat-DGNet: Enhancing 3D Dense Captioning with Quaternion-Based Spatial Offsets and Dynamic Neighborhood Graphs

1. Meta-Learning Based Knowledge Distillation for Domain Adaptive Nighttime Segmentation
2. Semantics Guided Disentangled GAN for Chest X-ray Image Rib Segmentation
3. PottsNN: A Variational Neural Network Based on Potts Model for Image Segmentation
4. PRM: A Pixel-Region-Matching Approach for Fast Video Object Segmentation
5. RT-VIS: Real-time Video Instance Segmentation with Light-weight Decoupled Framework

1. Masked Visual Pre-training for RGB-D and RGB-T Salient Object Detection
2. Infrared Small Target Detection via Edge Refinement and Joint Attention Enhancement
3. Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection
4. Completing Saliency from Details
5. Detect Text Forgery with Non-Forged Image Features: A Framework for Detection and Grounding of Image-Text Manipulation
Oral: Low-Level Vision
1. DFANet: A Dual-stream Deep Feature Aware Network for Multi-focus Image Fusion
2. GAN-Diffusion Relay Model: Advancing Semantic Image Synthesis
3. Multiview Light Field Angular Super-Resolution based on View Alignment and Frequency Attention
4. SFformer: Adaptive Sparse and Frequency-Guided Transformer Network for Single Image Derain
5. Hyperspectral Image Super-resolution Based on Dual-domain Gated Attention Network
Oral: 3D Vision
1. MTFusion: Reconstructing Any 3D Object from Single Image Using Multi-Word Textual Inversion
2. Disparity Refinement Based on Cross-Modal Feature Fusion and Global Hourglass Aggregation for Robust Stereo Matching
3. M3Pose: Multi-person 3D Pose Estimation Using Sparse Millimeter-wave Radar Point Clouds
4. Animatable Human Rendering from Monocular Video via Pose-Independent Deformation
5. Discriminative-guided Diffusion-based Self-supervised Monocular Depth Estimation

1. Unleashing the Class-Incremental Learning Potential of Foundation Models by Virtual Feature Generation and Replay
2. SLRL: Structured Latent Representation Learning for Multi-view Clustering
3. SPLICEGNN: SPLIt and ConnEct Tracklets in a unified Graph Neural Network
4. Few-Shot Class-Incremental Learning via Cross-Modal Alignment with Feature Replay
5. Self-Quantization with Adaptive Codebooks for Unsupervised Image Retrieval

1. Generative Steganography Based on Dual-Branch Flow
2. SACTGAN-EE imbalanced data processing method for credit default prediction
3. Misclassification Detection via Counterexample Learning for Trustworthy Cervical Cancer Screening
4. DBMF-Net: A Dual-Branch Multimodal Fusion Network for Multi-Label Sewer Defect Classification
5. Feature Exchange and Distribution-based Mining Land Detection Method by Multispectral Imagery

1. Medical Image Processing and Analysis | A Fine-grained Recurrent Network for Image Segmentation via Vector Field Guided Refinement
2. Medical Image Processing and Analysis | Semi-supervised Medical Image Segmentation with Strong/Weak Task-aware Consistency
3. Medical Image Processing and Analysis | Steerable Pyramid Transform Enables Robust Left Ventricle Quantification
4. Medical Image Processing and Analysis | MedPrompt: Cross-Modal Prompting for Multi-Task Medical Image Translation
5. Medical Image Processing and Analysis | Enhancing Hippocampus Segmentation: SwinUNETR Model Optimization with CPS
6. Medical Image Processing and Analysis | Uncertainty-inspired Credible Pseudo-Labeling in Semi-Supervised Medical Image Segmentation
7. Medical Image Processing and Analysis | MFPNet: Mixed Feature Perception Network for Automated Skin Lesion Segmentation
8. Medical Image Processing and Analysis | LD-BSAM: Combined Latent Diffusion with Bounding SAM for HIFU target region segmentation
9. Medical Image Processing and Analysis | Hierarchical Decoder with Parallel Transformer and CNN for Medical Image Segmentation
11. Medical Image Processing and Analysis | APAN: Anti-curriculum Pseudo-labelling and Adversarial Noises Training for Semi-supervised Medical Image Classification
12. Medical Image Processing and Analysis | Multi-Modal Learning for Predicting the Progression of Transarterial Chemoembolization Therapy in Hepatocellular Carcinoma
13. Medical Image Processing and Analysis | Growing with the help of multiple teachers: lightweight and noise-resistant student model for medical image classification
14. Medical Image Processing and Analysis | DRA-CN: A novel Dual-Resolution Attention Capsule Network for Histopathology Image Classification
15. Medical Image Processing and Analysis | A Mask Guided Network for Self-Supervised Low-Dose CT Imaging
16. Medical Image Processing and Analysis | Dental Diagnosis from X-Ray Panoramic Radiography Images: A Dataset and A Hybrid Framework
17. Medical Image Processing and Analysis | Edge-Guided Bidirectional-Attention Residual Network for Polyp Segmentation
18. Medical Image Processing and Analysis | From Coarse to Fine: A Novel Colon Polyp Segmentation Method Like Human Observation
19. Medical Image Processing and Analysis | Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification
20. Medical Image Processing and Analysis | Multi-Perspective Text-Guided Multimodal Fusion Network for Brain Tumor Segmentation
21. Medical Image Processing and Analysis | Continual Learning for Fundus Image Segmentation
22. Medical Image Processing and Analysis | Embedded Deep Learning Based CT Images for Rifampicin Resistant Tuberculosis Diagnosis
23. Medical Image Processing and Analysis | Combining Segment Anything Model with Domain-Specific Knowledge for Semi-Supervised Learning in Medical Image Segmentation
24. Medical Image Processing and Analysis | Meply: A Large-scale Dataset and Baseline Evaluations for Metastatic Perirectal Lymph Node Segmentation
25. Medical Image Processing and Analysis | Swin-HAUnet: A Swin-Hierarchical Attention Unet For Enhanced Medical Image Segmentation
26. Medical Image Processing and Analysis | ODC-SA Net: Orthogonal Direction Enhancement and Scale Aware Network for Polyp Segmentation
27. Medical Image Processing and Analysis | Two-Stage Multi-Scale Feature Fusion for Small Medical Object Segmentation
28. Medical Image Processing and Analysis | A Two-Stage Automatic Collateral Scoring Framework Based on Brain Vessel Segmentation
29. Medical Image Processing and Analysis | SPARK: Cross-Guided Knowledge Distillation with Spatial Position Augmentation for Medical Image Segmentation
30. Medical Image Processing and Analysis | VATBoost-Net: Integrating Enhanced Feature Perturbation and Detail Enhancement for Medical Image Segmentation
31. Medical Image Processing and Analysis | DTIL-Net: Dual-Task Interactive Learning Network for Automated Grading of Diabetic Retinopathy and Macular Edema
32. Medical Image Processing and Analysis | DeformSegNet: Segmentation Network Fused with Deformation Field for Pancreatic CT Scans
33. Medical Image Processing and Analysis | InsSegLN: A Novel 3D Instance Segmentation Method for Mediastinal Lymph Node
34. Medical Image Processing and Analysis | RRANet: A Reverse Region-Aware Network with Edge Difference for Accurate Breast Tumor Segmentation in Ultrasound Images
35. Medical Image Processing and Analysis | Learning Frequency and Structure in UDA for Medical Object Detection
36. Medical Image Processing and Analysis | Skin Lesion Segmentation Method Based On Global Pixel Weighted Focal Loss
37. Medical Image Processing and Analysis | Competing Dual-Network with Pseudo-Supervision Rectification for Semi-Supervised Medical Image Segmentation
38. Medical Image Processing and Analysis | Dual-Branch Perturbation and Conflict-Based Scribble-Supervised Meibomian Gland Segmentation
39. Medical Image Processing and Analysis | Anchored Supervised Contrastive Learning for Long-Tailed Medical Image Regression
40. Medical Image Processing and Analysis | Dynamic Feature Fusion Based on Consistency and Complementarity of Brain Atlases
41. Medical Image Processing and Analysis | FUF-TransUNet: a transformer-based U-Net with fully utilize of features for liver and liver-tumor segmentation in CT images
42. Medical Image Processing and Analysis | Dual-View Dual-Boundary Dual U-Nets for Multiscale Segmentation of Oral CBCT Images
43. Medical Image Processing and Analysis | A Novel Diffusion Model with Wavelet Transform for Optic Disc and Cup Segmentation in Fundus Images
44. Medical Image Processing and Analysis | STCTb: A Spatio-Temporal Collaborative Transformer Block for Brain Diseases Classification using fMRI Time Series.
45. Medical Image Processing and Analysis | A Generalized Contrast-adjustment Guided Growth Method for Medical Image Segmentation
46. Medical Image Processing and Analysis | MDNet: Morphology-Driven Weakly Supervised Polyp Detection
47. Medical Image Processing and Analysis | MMR-Sleep: A Multi-Channel and Multi-Receptive Field Sleep Stage recognition Model
48. Medical Image Processing and Analysis | CPNet: Cross Prototype Network for Few-shot Medical Image Segmentation
49. Medical Image Processing and Analysis | SBC-UNet: A Network Based on Improved Hourglass Attention Mechanism and U-Net for Medical Image Segmentation
50. Medical Image Processing and Analysis | Bridge the gap of semantic context: A Boundary-guided Context Fusion UNet for Medical Image Segmentation
51. Medical Image Processing and Analysis | Bilinear Fine-grained Classification of Ultrasound Images Integrated with Interpretable Radiomics
52. Medical Image Processing and Analysis | GCNet: Global context-guided uncertainty boundary for polyp segmentation
53. Medical Image Processing and Analysis | Comprehensive Transformer Integration Network (CTIN): Advancing Endoscopic Disease Segmentation with Hybrid Transformer Architecture
54. Medical Image Processing and Analysis | IPM: An Intelligent Component for 3D Brain Tumor Segmentation Integrating Semantic Extractor and Pixel Refiner
55. Medical Image Processing and Analysis | Edge-Net: A Self-supervised Medical Image Segmentation Model Based on Edge Attention
56. Medical Image Processing and Analysis | Fundus image disease diagnosis and quality assessment based on dual-task collaborative optimization
57. Medical Image Processing and Analysis | MFIS-net: A Deep Learning Framework for Left Atrial Segmentation
58. Medical Image Processing and Analysis | Semi-Supervised Gland Segmentation via Label Purification and Reliable Pixel Learning
59. Multi-Modal Information Processing | A Multi-modal Framework with Contrastive Learning and Sequential Encoding for Enhanced Sleep Stage Detection
60. Multi-Modal Information Processing | Charting the Uncharted: Building and Analyzing a Multifaceted Chart Question Answering Dataset for Complex Logical Reasoning Process
61. Multi-Modal Information Processing | Time-Frequency Mutual Learning for Moment Retrieval and Highlight Detection
62. Multi-Modal Information Processing | Cascade Coarse-to-Fine Point-Query Transformer for RGB-T Crowd Counting
63. Multi-Modal Information Processing | Perceptual Image Compression with Text-Guided Multi-Level Fusion
64. Multi-Modal Information Processing | Evaluating Attribute Comprehension in Large Vision-language Models
65. Multi-Modal Information Processing | Efficient Multi-modal Human-centric Contrastive Pre-training with A Pseudo Body-structured Prior
66. Multi-Modal Information Processing | A3R: Vision Language Pre-training by Attentive Alignment and Attentive Reconstruction
67. Multi-Modal Information Processing | Mixture-of-Hand-Experts: Repainting the Deformed Hand Images Generated by Diffusion Models
68. Multi-Modal Information Processing | ConD2: Contrastive Decomposition Distilling for Multimodal Sentiment Analysis
69. Multi-Modal Information Processing | Multi-layer Tuning CLIP for Few-Shot Image Classification
70. Multi-Modal Information Processing | DIM: Dynamic Integration of Multimodal Entity Linking with Large Language Model
71. Multi-Modal Information Processing | Text-Dominant Interactive Attention for Cross-Modal Sentiment Analysis.
72. Multi-Modal Information Processing | Dual Context Perception Transformer for Referring Image Segmentation
73. Multi-Modal Information Processing | ELEMO: Elements Focused Emotion Recognition for Sticker Images
74. Multi-Modal Information Processing | Cross-Modal Dual Matching and Comparison for Text-to-Image Person Re-identification
75. Multi-Modal Information Processing | Low-resource Machine Translation with Different Granularity Image Features
76. Multi-Modal Information Processing | ST-SBV: Spatial-Temporal Self-Blended Videos for Deepfake Detection
77. Multi-Modal Information Processing | Learning a Robust Synthetic Modality with Dual-Level Alignment for Visible-Infrared Person Re-identification
78. Multi-Modal Information Processing | Deep Noisy Multi-Label Learning for Robust Cross-Modal Retrieval
79. Multi-Modal Information Processing | Uncertainty-Aware with Negative Samples for Video-Text Retrieval
80. Multi-Modal Information Processing | Multi-Modal Knowledge-enhanced Fine-Grained Image Classification
81. Multi-Modal Information Processing | Bridging Modality Gap for Visual Grounding with Effective Cross-modal Distillation
82. Multi-Modal Information Processing | EGSRNet: Emotion-label Guiding and Similarity Reasoning Network for Multimodal Sentiment Analysis
83. Multi-Modal Information Processing | VL-MPFT: Multitask Parameter-Efficient Fine-Tuning for Visual-Language Pre-trained Models via Task-adaptive Masking
84. Multi-Modal Information Processing | A Multimodal Fake News Detection Model Leveraging Image Frequency and Spatial Domain Analysis with Deep Dynamic Trade-off Fusion
85. Multi-Modal Information Processing | Efficiency-Aware Fine-grained Vision-Language Retrieval via a Global-Contextual Autoencoder
86. Multi-Modal Information Processing | Towards Making the Most of Knowledge across Languages for Multimodal Cross-Lingual Summarization
87. Multi-Modal Information Processing | Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning
88. Multi-Modal Information Processing | Multimodal Feature Hierarchical Alignment for Text-Based Person Re-identification
89. Multi-Modal Information Processing | Mitigating Hallucination in Visual-Language Models via Re-Balancing Contrastive Decoding
90. Multi-Modal Information Processing | Multimodal medical image registration using optimized phase consistency within joint Frequency-Space domain
91. Multi-Modal Information Processing | Robust Contrastive Learning against Audio-Visual Noisy Correspondence
92. Multi-Modal Information Processing | Enhancing Cross-Modal Alignment in Multimodal Sentiment Analysis via Prompt Learning
93. Multi-Modal Information Processing | Efficient Language-driven Action Localization by Feature Aggregation and Prediction Adjustment
94. Multi-Modal Information Processing | Greedy Fusion Oriented Representations for Multimodal Sentiment Analysis
95. Multi-Modal Information Processing | Counterfactual Multimodal Fact-Checking Method based on Causal Intervention
96. Multi-Modal Information Processing | Rethinking the Necessity of Learnable Modal Alignment for Medical Image Fusion
97. Multi-Modal Information Processing | Taming Diffusion for Fashion Clothing Generation with Versatile Condition
98. Video Analysis and Understanding | Behavior Capture Based Explainable Engagement Recognition
99. Video Analysis and Understanding | Utilizing Text-video Relationships: A Text-driven Multi-modal Fusion Framework for Moment Retrieval and Highlight Detection
100. Video Analysis and Understanding | LLMAction: Adapting Large Language Model for Long-Term Action Anticipation
101. Video Analysis and Understanding | Dual-scale Temporal Dependency Learning for Unsupervised Video Anomaly Detection
102. Video Analysis and Understanding | LMS-VDR: Integrating Landmarks into Multi-Scale Hybrid Net for Video-based Depression Recognition
103. Video Analysis and Understanding | Motion Trajectory Reconstruction Based on Feature Matching and Gradient Graph Laplacian Regularizer
104. Video Analysis and Understanding | OC-SAN: Unsupervised Deepfake Detection for Specific Individual Protection Based on Deep One-class Classification
105. Video Analysis and Understanding | Focus on Subtle Actions: Semantic and Saliency Knowledge Co-Propagation Method for Weakly-Supervised Temporal Action Localization
106. Video Analysis and Understanding | Flow-Audio-Synth: A Video-to-Audio Model which Captures Dynamic Features
107. Video Analysis and Understanding | SCAMS: Semantic Category-Aware Multi-Scale Network for Video Quality Assessment
108. Video Analysis and Understanding | Cross-temporal Fusion Memory Network for Traffic Accident Detection
109. Video Analysis and Understanding | Video Frame Interpolation for Large Motion with Generative Prior
110. Video Analysis and Understanding | MS-DETR: Exploiting Modality Synergy for Moment Retrieval and Highlight Detection
111. Video Analysis and Understanding | PosCap: Boosting Video Captioning with Part-of-Speech Guidance
112. Video Analysis and Understanding | Element-Centered Multi-Granularity Network for Dense Video Captioning
113. Video Analysis and Understanding | AI-Generated Video Detection via Spatial-Temporal Anomaly Learning
114. Video Analysis and Understanding | Dynamic Temporal Shift Feature Enhancement for Few-Shot Action Recognition
115. Multimedia Analysis and Reasoning | StyleFusion TTS: Multimodal Style-control and Enhanced Feature Fusion for Zero-shot Text-to-speech Synthesis
116. Multimedia Analysis and Reasoning | Robust Document Presentation Attack Detection via Diffusion Models and Knowledge Distillation
117. Multimedia Analysis and Reasoning | Towards the transferable reversible adversarial example via distribution-relevant attack
118. Multimedia Analysis and Reasoning | Fine-grained Feature Assisted Cross-modal Image-text Retrieval
119. Multimedia Analysis and Reasoning | Uncertainty-aware Gradient Modulation and Feature Masking for Multimodal Sentiment Analysis
120. Multimedia Analysis and Reasoning | Local and Global Features Interactive Fusion Network for Macro- and Micro-Expression Spotting in Long Videos
121. Multimedia Analysis and Reasoning | OmniStyleGAN for Style-Guided Image-to-Image Translation
122. Fundamental Theory of Computer Vision | Miss-CAM: Visual interpretation algorithm for convolutional neural networks using missingness masks
123. Fundamental Theory of Computer Vision | GSE-Ships: Ship Detection Using Optimized Lightweight Networks and Attention Mechanisms
124. Fundamental Theory of Computer Vision | Balancing Complementarity and Consistency via Delayed Activation in Incomplete Multi-view Clustering
125. Fundamental Theory of Computer Vision | Small Target Defects Detection of Aluminum Plates Surface using an MSN-YOLOv5 model
126. Performance Evaluation and Benchmarks | Benchmarking Multi-Scene Fire and Smoke Detection
127. Performance Evaluation and Benchmarks | Performance Evaluation of Anomaly Detection with a new Battery Surface Anomaly Dataset
128. Performance Evaluation and Benchmarks | Fine-grained Metrics for Point Cloud Semantic Segmentation
129. Performance Evaluation and Benchmarks | 114Xray: A Large-scale X-ray Security Detection Benchmark and Aware Enhance Network for Real-World Prohibited Item Inspection in Baggage
130. Structural Pattern Recognition | Enhancing Semi-Dense Feature Matching through Probabilistic Modeling of Cascaded Supervision and Consistency
131. Structural Pattern Recognition | Concentrating Estimation Attention: Human Prior Constrained Methods for Robust Classification

1. Object Detection Tracking and Identification | Mamba-FETrack: Frame-Event Tracking via State Space Model
2. Object Detection Tracking and Identification | Stay Open: Calibrating Weights Continuously for Detecting Out-of-Distribution Objects On the Fly
3. Object Detection Tracking and Identification | MPE: A Fine-grained Multi-Path Feature Enhancer in MOT
4. Object Detection Tracking and Identification | CIMTD: Class Incremental Multi-Teacher Knowledge Distillation for Fractal Object Detection
5. Object Detection Tracking and Identification | A Stochastic Model for video object tracking
6. Object Detection Tracking and Identification | Class-Agnostic Detection of Unknown Objects From Foreground Improves Robust Open World Object Detection
7. Object Detection Tracking and Identification | ML-SCODNet: Multitask Learning for Scene Classification and Object Detection Network from Remote Sensing Images
8. Object Detection Tracking and Identification | Cross-Domain Attention Alignment for Domain Adaptive Person Re-ID
9. Object Detection Tracking and Identification | EfficientMatting: Bilateral Matting Network for Real-time Human Matting
10. Object Detection Tracking and Identification | Semi-Supervised Camouflaged Object Detection: Multi Information Fusion Combined with Adaptive Receptive Field Selection Network
11. Object Detection Tracking and Identification | Revisiting Network Perturbation for Semi-Supervised Semantic Segmentation
12. Object Detection Tracking and Identification | TL-ReID: Tight-Loose Pairwise Loss for Object Re-Identification
13. Object Detection Tracking and Identification | LS-YOLO: A Lightweight Selective Enhanced YOLOv8 Algorithm for UAV Aerial Photography
14. Object Detection Tracking and Identification | Camouflaged Object Detection via Scale-Feature Attention and Type-Feature Attention
15. Object Detection Tracking and Identification | GOP: A Group Object Perception Framework for Optical Remote Sensing
16. Object Detection Tracking and Identification | Mask-Guided Clothes-irrelevant and Background-irrelevant Network with Knowledge Propagation for Cloth-Changing Person Re-Identification
17. Object Detection Tracking and Identification | Region Aware Transformer with Intra-Class Compact for Unsupervised Aerial Person Re-identification
18. Object Detection Tracking and Identification | Camouflaged Object Detection based on Feature Aggregation and Global Semantic Learning
19. Object Detection Tracking and Identification | Small Target Detector based on Adaptive Re-parameterized Spatial Feature Fusion Mechanism
20. Object Detection Tracking and Identification | Dual Constraint Parallel Multi-scale Attention Network for Insulator Detection in Foggy Scene
21. Object Detection Tracking and Identification | Lightweight Defog Detection for Autonomous Vehicles: Balancing Clarity, Efficiency, and Accuracy
22. Object Detection Tracking and Identification | EI-YOLO: Efficiently Improved YOLO on Detection of Prohibited Items During Security inspections
23. Object Detection Tracking and Identification | TSTrack: A Robust Object Tracking Framework Integrated Temporal and Spatial Features
24. Object Detection Tracking and Identification | A Faster Fire Detection Network with Global Information Awareness
25. Object Detection Tracking and Identification | More Efficient Encoder: Boosting Transformer-Based Multi-Object Tracking Performance Through YOLOX
26. Object Detection Tracking and Identification | PLRUT: Pseudo Label and Re-detection boosted Unsupervised Tracking of Unmanned Aerial Vehicle Objects
27. Object Detection Tracking and Identification | End-to-end High-quality Transformer Object Detection Model Applied to Human Head Detection
28. Object Detection Tracking and Identification | Local Point Matching for Collaborative Image Registration and RGBT Anti-UAV Tracking
29. Object Detection Tracking and Identification | ORU-YOLO: A UAV Image Detection Model Optimized for Resource Utilization
30. Object Detection Tracking and Identification | Memoryless Multimodal Anomaly Detection via Student-Teacher Network and Signed Distance Learning
31. Object Detection Tracking and Identification | MRFNet: A Multi-Receptive-Field Fusion Network for Multi-Food Recognition
32. Object Detection Tracking and Identification | Image-Centered Pseudo Label Generation for Weakly Supervised Text-based Person Re-Identification
33. Object Detection Tracking and Identification | SelfLoc: High Quality Unsupervised Object Localization with Self-Prompt SAM
34. Object Detection Tracking and Identification | A Global Re-detection Method Based on Feature Interaction Siamese Network
35. Object Detection Tracking and Identification | Key Object Detection: Unifying Salient and Camouflaged Object Detection into One Task
36. Object Detection Tracking and Identification | MOFTrack: Multi-Object Formation Tracking in Remote Sensing Videos
37. Object Detection Tracking and Identification | SDNet: A Simple and Efficient Salient Object Detection Decoder with Only 60K Parameters
38. Object Detection Tracking and Identification | Scale-Adaptive Modulation Meet Compact Axial Transformer for Small Object Detection in UAV-Vision
39. Object Detection Tracking and Identification | Lightweight and Multi-Scale Adaptive Network for Infrared Small Target Detection
40. Object Detection Tracking and Identification | Multi-View Cross-Attention Network for Hyperspectral Object Tracking
41. Object Detection Tracking and Identification | CountMamba: Exploring Multi-directional Selective State-Space Models for Plant Counting
42. Object Detection Tracking and Identification | ECLNet: A Compact Encoder-Decoder Network for Efficient Camouflaged Object Detection
43. Object Detection Tracking and Identification | Few-Shot Object Detection via Disentangling Class-Related Factors in Feature Distribution
44. Object Detection Tracking and Identification | Multi-class token-guided end-to-end weakly supervised image semantic segmentation method
45. Object Detection Tracking and Identification | DIDNet: An End-to-End Directional Insulator Detection Network based on direction field
46. Object Detection Tracking and Identification | L2FIG-Tracker: l2-norm based Fusion with Illumination Guidance for RGB-D Object Tracking
47. Object Detection Tracking and Identification | CDAF3D: Cross-Dimensional Attention Fusion for Indoor 3D Object Detection
48. Object Detection Tracking and Identification | RETrack: Multi-Object Tracking by Associating Proposal Regions
49. Object Detection Tracking and Identification | PGNET: A Real-time efficient model for underwater object detection
50. Object Detection Tracking and Identification | A Temporal Recognition Framework for Multi-Sheep Behaviour Using ViTSORT and YOLOv8-MS
51. Object Detection Tracking and Identification | Tracking Transforming Objects: A Benchmark
52. Object Detection Tracking and Identification | Modality-Shared Prototypes for Enhanced Unsupervised Visible-Infrared Person Re-identification
53. Object Detection Tracking and Identification | Vehicle Re-identification with a Pose-aware Discriminative Part Learning Model
54. Object Detection Tracking and Identification | Dual-Teacher Network with SSIM based Reverse Distillation for Anomaly Detection
55. Object Detection Tracking and Identification | CFMVOR: Federated Multi-view 3D Object Recognition Based on Compressed Learning
56. Object Detection Tracking and Identification | Enhanced Anomaly Detection using Spatial-Alignment and Multi-scale Fusion
57. Object Detection Tracking and Identification | Confidence-Weighted Teacher: Semi-Supervised Object Detection Based on Confidence Correction
58. Biometric Recognition | MST-Gait: Application of Multi-Scale Temporal Modeling to Gait Recognition
59. Biometric Recognition | Identity-Preserving Animal Image Generation for Animal Individual Identification
60. Biometric Recognition | FIL-FLD: Few-shot Incremental Learning with EMD Metric for High Generalization Fingerprint Liveness Detection
61. Biometric Recognition | Text Based Unsupervised Domain Generalization Person Re-identification
62. Biometric Recognition | SF-Gait: Two-Stage Temporal Compression Network for Learning Gait Micro-Motions and Cycle Patterns
63. Biometric Recognition | Coarse-to-Fine Domain Adaptation for Cross-subject EEG Emotion Recognition with Contrastive Learning
64. Biometric Recognition | Face Anti-spoofing based on Multi-view Anomaly Detection
65. Biometric Recognition | Online Signature Verification Based on Recurrent Attentional Time-Delay Neural Networks
66. Biometric Recognition | Multimodal finger recognition based on feature fusion attention for fingerprints, finger-veins, and finger-knuckle-prints
67. Biometric Recognition | Hierarchical Discrepancy-aware Interaction Network for Face Forgery Detection
68. Biometric Recognition | AU-vMAE: Knowledge-Guide Action Units Detection via Video Masked Autoencoder
69. Biometric Recognition | Transformer-based Multimodal Spatial-Temporal Fusion for Gait Recognition
70. Biometric Recognition | Multi-level Distributional Discrepancy Enhancement for Cross Domain Face Forgery Detection
71. Biometric Recognition | Unsupervised person Re-ID based on nonlinear asymmetric metric learning
72. Biometric Recognition | FR-watermarking: A Fusion Framework for Face-Based Digital Watermarking
73. Document Analysis and Recognition | Leveraging Structure Knowledge and Deep Models for the Detection of Abnormal Handwritten Text
74. Document Analysis and Recognition | OCR-aware Scene Graph Generation via Multi-modal Object Representation Enhancement and Logical Bias Learning
75. Document Analysis and Recognition | Enhancing Transformer-based Table Structure Recognition for Long Tables
76. Document Analysis and Recognition | Show Exemplars and Tell Me What You See: In-context Learning with Frozen Large Language Models for TextVQA
77. Document Analysis and Recognition | MLR-NET: an arbitrary skew angle detection algorithm for complex layout document images
78. Document Analysis and Recognition | TextViTCNN: Enhancing Natural Scene Text Recognition with Hybrid Transformer and Convolutional Networks
79. Document Analysis and Recognition | Enhancing Visual Information Extraction with Large Language Models through Layout-aware Instruction Tuning
80. Document Analysis and Recognition | SFENet: Arbitrary Shapes Scene Text Detection with Semantic Feature Extractor
81. Document Analysis and Recognition | Improving Zero-Shot Image Captioning Efficiency with Metropolis-Hastings Sampling
82. Document Analysis and Recognition | Improving Text Classification Performance through Multimodal Representation
83. Document Analysis and Recognition | A Multi-feature Fusion Approach for Words Recognition of Ancient Mongolian Documents
84. Document Analysis and Recognition | TableRocket: An Efficient and Effective Framework for Table Reconstruction
85. Document Analysis and Recognition | Not All Texts Are the Same: Dynamically Querying Texts for Scene Text Detection
86. Document Analysis and Recognition | Multi-Modal Attention based on 2D Structured Sequence for Table Recognition
87Character   RecognitionScene   Text Recognition via k-NN Attention-based Decoder and Margin-based Softmax   Loss
88Character   RecognitionReal-Time   Text Detection with Multi-Level Feature Fusion and Pixel Clustering
90Character   RecognitionDual   Feature Enhanced Scene Text Recognition Method for Low-Resource Uyghur
91Character   RecognitionSegmentation-free   Todo Mongolian OCR and Its Public Dataset
92Character   RecognitionHybrid   Encoding Method for Scene Text Recognition in Low-Resource Uyghur
93Character   RecognitionROBC:   a Radical-Level Oracle Bone Character Dataset
94Character   RecognitionIntegrated   Recognition of Arbitrary-Oriented Multi-Line Billet Number
95Character   RecognitionImproving   Scene Text Recognition with Counting Aware Contrastive Learning and Attention   Alignment
96Character   RecognitionGridMask:   An Efficient Scheme for Real Time Curved Scene Text Detection
97Character   RecognitionTibetan   Handwriting Recognition Method based on Structural Re-parameterization ViT   and Vertical Attention
98Face   Recognition and Pose RecognitionMPM:   A Unified 2D-3D Human Pose Representation via Masked Pose Modeling
99Face   Recognition and Pose RecognitionJoint   Multi-Cue Learning for Emotion Recognition in Human-Computer Interaction
100Face   Recognition and Pose RecognitionDepth   Decoupling for Bottom-Up Multi-Person 3D Pose Estimation
101Face   Recognition and Pose RecognitionTT-DF:   A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery   Detection
102Face   Recognition and Pose RecognitionWalking   is Matter: A Benchmark for Fine-Grained Gait Segmentation 
103Face   Recognition and Pose RecognitionSpatial-Frequency   Dual-stream Reconstruction for Deepfake Detection
104Face   Recognition and Pose RecognitionDeepSweep:   Real-Time Multi-View 3D Pose Estimation via Cross-View Deep Matching and   Plane Sweeping
105Face   Recognition and Pose RecognitionPoseVR: Structure-aware   Hybrid Full-Body Pose Estimation in Virtual Reality
106Face   Recognition and Pose RecognitionFusion   Network Based on Motion Learning and Image Feature Representation for   Micro-expression Recognition
107Face   Recognition and Pose RecognitionDepth-Aware   Dual-stream Interactive Transformer Network for Facial Expression Recognition
108Face   Recognition and Pose RecognitionSCALE-Pose:   Skeletal Correction and Language Knowledge-assisted for 3D Human Pose   Estimation
109Action   RecognitionA   Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human   Interaction Recognition
110Action   RecognitionSpatio-Temporal   Contrastive Learning for Compositional Action Recognition
111Action   RecognitionPath-Guided   Motion Prediction with Multi-View Scene Perception
112Action   RecognitionAttention-based   Spatio-temporal modeling with 3D Convolutional Neural Networks for Dynamic   Gesture Recognition
113Action   RecognitionMIT:   Multi-cue Injected Transformer for Two-stage HOI Detection
114Action   RecognitionDIDA:   Dynamic Individual-to-integrateD Augmentation for Self-Supervised   Skeleton-Based Action Recognition
115Action   RecognitionImproving   Video Representation of Vision-Language Model with Decoupled Explicit   Temporal Modeling
116Action   RecognitionKS-FuseNet:   An efficient action recognition method based on keyframe selection and   feature fusion
117Action   RecognitionDynamic   Skeleton Association Transformer for dyadic Interaction Action Recognition
118Action   RecognitionSpecies-Aware   Guidance for Animal Action Recognition with Vision-Language Knowledge
119Feature   Extraction and Feature SelectionCredit-based   Negative Sample Denoising in Contrastive Learning
120Feature   Extraction and Feature SelectionUDD:   Dataset Distillation via Mining Underutilized Regions
121Feature   Extraction and Feature SelectionBertTab:   Table Learning with Feature Descriptions and Context
122Feature   Extraction and Feature SelectionSemVG:   Semantic Fused Feature Extraction Network for Visual Geo-localization under   Urban Street Scenes
123Feature   Extraction and Feature SelectionEfficient   Discriminative Feature Selection With Grouping Relative Comparison
124Feature   Extraction and Feature SelectionBlock   Cipher Algorithm Identification Based On CNN-Transformer Fusion Model
125Feature   Extraction and Feature SelectionDSTF:   Dual-Stream Spatio-Temporal Fusion Network for Event-Based Data
126Pattern   Classification and Cluster AnalysisAdapt   and Refine: A Few-Shot Class-Incremental Learner via Pre-trained Models
127Pattern   Classification and Cluster AnalysisLearning   Fully Parametric Subspace Clustering
128Pattern   Classification and Cluster AnalysisA   Comprehensive Exploration on Detecting Fake Images Generated by Stable   Diffusion
129Pattern   Classification and Cluster AnalysisAdaptive   Margin Global Classifier for Exemplar-Free Class-Incremental Learning
130Pattern   Classification and Cluster AnalysisFedHC:   Learning Imbalanced Clusters via Federated Hierarchical Clustering
131Pattern   Classification and Cluster AnalysisEnhancing   Time Series Classification with Explainable Time-frequency Features   Representation
132Pattern   Classification and Cluster AnalysisAdaptive   Unified Framework with Global Anchor Graph for Large-scale Multi-view   Clustering

1Low-Level   Vision and Image ProcessingFocal   Perception Transformer for Light Field Salient Object Detection
2Low-Level   Vision and Image ProcessingA   Fourier Transform Framework for Domain Adaptation
3Low-Level   Vision and Image ProcessingBidirectional   Alternating Fusion Network for RGB-T Salient Object Detection
4Low-Level   Vision and Image ProcessingSimpleFusion:   A Simple Fusion Framework for Infrared and Visible Images
5Low-Level   Vision and Image ProcessingA   Cross-Consistency Strategy for Clearer Perception in Low-Light Haze
6Low-Level   Vision and Image ProcessingImprove   Segmentation robustness of Intracellular Structures in Fluorescence   Microscopy Images
7Low-Level   Vision and Image ProcessingRDSR: Reparameterized   Lightweight Diffusion Model For Image Super-Resolution
8Low-Level   Vision and Image ProcessingACENet:   Adaptive Context Enhancement Network for RGB-T Video Object Detection
9Low-Level   Vision and Image ProcessingFake-GPT:   Detecting Fake Image via Large Language Model
10Low-Level   Vision and Image ProcessingTowards   Elastic Image Super-Resolution Network via Progressive Self-distillation
11Low-Level   Vision and Image ProcessingPHANet:   Progressive Hybrid Attention Network for Enhanced Video Deraining
12Low-Level   Vision and Image ProcessingFrequency   Adapter and Spatial Prompt Network for All-in-One Blind Image Restoration
13Low-Level   Vision and Image ProcessingAmbient   Illumination Disentangled Based Weakly-Supervised Image Restoration Using   Adaptive Pixel Retention Factor
14Low-Level   Vision and Image ProcessingF4SR:   A Feed-Forward Regression Approach for Few-Shot Face Super-Resolution
15Low-Level   Vision and Image ProcessingThangka   Mural Super-Resolution Based on Nimble Convolution and Overlapping Window   Transformer
16Low-Level   Vision and Image ProcessingEHAT: Enhanced   Hybrid Attention Transformer for Remote Sensing Image Super-Resolution
17Low-Level   Vision and Image ProcessingMIAFusion:   Infrared and Visible Image Fusion via Multi-scale Spatial and Channel-Aware   Interaction Attention
18Low-Level   Vision and Image ProcessingColor   Enhanced Network for Image Dehazing
19Low-Level   Vision and Image ProcessingAdvancing   Real-World Burst Denoising: A New Benchmark and Dual-Branch Burst Denoising   Network
20Low-Level   Vision and Image ProcessingDIFNet:   Dual-Domain Information Fusion Network for Image Denoising
21Low-Level   Vision and Image ProcessingSimultaneous   Snow Mask Prediction and Single Image Desnowing with a Bidirectional   Attention Transformer Network
22Low-Level   Vision and Image ProcessingDBIF:   Dual-Branch Feature Extraction Network for Infrared and Visible Image Fusion
23Low-Level   Vision and Image ProcessingMulti-dimensional   Information Awareness Residual Network for Lightweight Image Super-Resolution
24Low-Level   Vision and Image ProcessingFeature   Pruning and Recovery Learning with Knowledge Distillation for Occluded Person   Re-Identification
25Low-Level   Vision and Image ProcessingA   Redundancy-suppression based Event Sampling Method for Structured   Representation
26Low-Level   Vision and Image ProcessingBSDiff:   Low-Light Image Enhancement Via Blueprint Separable Convolution and   Wavelet-Diffusion Model
27Low-Level   Vision and Image ProcessingSAM   and Diffusion Based Adversarial Sample Generation for Image Quality   Assessment
28Low-Level   Vision and Image ProcessingCF-LAM:   Coarse-to-Fine Locally Affine Matching for Viewpoint Transformations
29Low-Level   Vision and Image ProcessingFormerUnify:   Transformer-Based Unified Fusion for Efficient Image Matting
30Low-Level   Vision and Image ProcessingCamouflage   Object Segmentation with Multi-scale Feature Aggregation and Boundary   Generation
31Low-Level   Vision and Image ProcessingGCMLP:   A Lightweight Network for Gamut Compression
32Low-Level   Vision and Image ProcessingLow-Light   Light-Field Image Enhancement with Geometry Consistency
33Low-Level   Vision and Image ProcessingAttention   and Boundary Induced Feature Refinement Network for Camouflaged Object   Detection
34Low-Level   Vision and Image ProcessingTowards   Specular Highlight Removal Through Diffusion Model
35Low-Level   Vision and Image ProcessingFIR:   A plug-in Feature-to-Image Reconstruction Method for Feature Coding for   Machines
36Low-Level   Vision and Image ProcessingFocal   Aggregation Transformer for Light Field Image Super-Resolution
37Low-Level   Vision and Image ProcessingCoMoFusion:   Fast and High-quality Fusion of Infrared and Visible Image with Consistency   Model
38Low-Level   Vision and Image ProcessingPatch   Attacks on Vision Transformer via Skip Attention Gradients
39Low-Level   Vision and Image ProcessingMulti-scale   Progressive Reconstruction Network for High Dynamic Range Imaging
40Low-Level   Vision and Image ProcessingFine-Grained   Adjustable Entropy Models for Rate-Complexity Jointly Adjustable Image   Compression
41Low-Level   Vision and Image ProcessingFPSNet:   Focus-Perceptual-Semantic Full Flow Visual Redundancy Predicting for Camera   Image
42Low-Level   Vision and Image ProcessingSemantic-Aware   Global and Local Fusion Model for Image Enhancement
43Low-Level   Vision and Image ProcessingA   Two-branch Fusion Network for Infrared and Visible Image Fusion
44Low-Level   Vision and Image ProcessingAttention-Guided   Residual Fourier Transformation Network for Single Image Deblurring
45Low-Level   Vision and Image ProcessingI3En:   A Multi-Level Iterative Low-Light Enhancement Network Based on Sketch Prior Guidance
46Low-Level   Vision and Image ProcessingEfficient   Stereo Matching Using Dynamic Graph
47Low-Level   Vision and Image ProcessingSynergizing   Global and Local Knowledge via Dynamic Focus Mechanism for Low-Light Image   Enhancement
48Low-Level   Vision and Image ProcessingTwo-Stage   Unsupervised Disentangled Realism Enhancement for Rendered Indoor Scene   Images
49Low-Level   Vision and Image ProcessingCorner   Detection: Passive Non-Line-of-Sight Pedestrian Detection
50Low-Level   Vision and Image ProcessingLabel-Correlation   Adaptive Central Similarity Hashing for Multi-Label Image Retrieval
513D   Vision and ReconstructionVisual   Harmony: LLM's Power in Crafting Coherent Indoor Scenes from Images
523D   Vision and ReconstructionSuperpixel   Cost Volume Excitation for Stereo Matching
533D   Vision and ReconstructionMulti-view   Depth Estimation with Adaptive Feature Extraction and Region-Aware Depth   Prediction
543D   Vision and Reconstruction3D   Data Augmentation for Driving Scenes on Camera
553D   Vision and ReconstructionA   Pose-Aware Auto-Augmentation Framework for 3D Human Pose and Shape Estimation   from Partial Point Clouds
563D   Vision and ReconstructionEfficient   Emotional Talking Head Generation via Dynamic 3D Gaussian Rendering
573D   Vision and ReconstructionGeneralizable   Geometry-aware Human Radiance Modeling from Multi-view Images
583D   Vision and ReconstructionAG-NeRF:   Attention-guided Neural Radiance Fields for Multi-height Large-scale Outdoor   Scene Rendering
593D   Vision and ReconstructionJPA:   A Joint-Part Attention for Mitigating Overfocusing on 3D Human Pose   Estimation
603D   Vision and ReconstructionRealistic   and Visually-pleasing 3D Generation of Indoor Scenes from a Single Image
613D   Vision and ReconstructionAttenPoint:   Exploring Point Cloud Segmentation through Attention-Based Modules
623D   Vision and ReconstructionMulti-view   3D Reconstruction by Fusing Polarization Information
633D   Vision and ReconstructionTrajectory-based   Calibration for Optical See-Through Head-Mounted Displays without Alignment
643D   Vision and ReconstructionMaximum   Spanning Tree for 3D Point Cloud Registration
653D   Vision and ReconstructionLearning   the Dynamic Spatio-Temporal Relationship Between Joints for 3D Human Pose   Estimation
663D   Vision and ReconstructionMaskEditor:   Instruct 3D Object Editing with Learned Masks
673D   Vision and ReconstructionDyGASR:   Dynamic Generalized Gaussian Splatting with Surface Alignment for Accelerated   3D Mesh Reconstruction
683D   Vision and ReconstructionMMIDM: Generating   3D Gesture from Multimodal Inputs with Diffusion Models
693D   Vision and ReconstructionMagicGS:   Combining 2D and 3D Priors for Effective 3D Content Generation
703D   Vision and ReconstructionESD-Pose:   Enhanced Semantic Discrimination for Generalizable 6D Pose Estimation
713D   Vision and ReconstructionTrans-DONeRF   for Transparent Object Rendering with Mixed Depth Prior
723D   Vision and ReconstructionSFDNeRF:   A Semantic Feature-Driven Few-Shot Neural Radiance Field Framework with   Hybrid Regularization
733D   Vision and ReconstructionTriEn-Net:   Non-parametric Representation Learning for Large-Scale Point Cloud Semantic   Segmentation
743D   Vision and ReconstructionDecomposed   Latent Diffusion Model for 3D Point Cloud Generation
753D   Vision and ReconstructionLearning   Multi-Branch Attention Networks for 3D Face Reconstruction
763D   Vision and ReconstructionCP-VoteNet:   Contrastive Prototypical VoteNet for Few-Shot Point Cloud Object Detection
773D   Vision and ReconstructionCross   Modality Fusion Network with Feature Alignment and Salient Object Exchange   for Single Image 3D Shape Retrieval
783D   Vision and ReconstructionEnhanced   Spatial Adaptive Fusion Network For Video Super-Resolution
793D   Vision and ReconstructionMulti-3D   Occlusion Mask Learning for Flexible Occlusion Removal in Neural Radiance   Fields
803D   Vision and ReconstructionSketch-Based   3D Shape Retrieval via Cross-Modal Contrastive Learning and Difficulty-Aware   Uncertainty Regularization
813D   Vision and ReconstructionResidual   Hybrid Attention Enhanced Video Super-Resolution with Cross Convolution
823D   Vision and ReconstructionSDFReg:   Learning Signed Distance Functions for Point Cloud Registration
833D   Vision and ReconstructionUnfolding   Gradient Graph Regularization for Point Cloud Color Denoising
853D   Vision and ReconstructionMultimodal   Token Fusion and Optimization for 3D Human Mesh Reconstruction with   Transformers
86Vision   Applications and SystemsImmersive   6DOF Roaming with Novel View Synthesis from Single Outdoor Panorama
87Vision   Applications and SystemsDelving   Deeper into Clean Samples for Combating Noisy Labels
88Vision   Applications and SystemsVision-Language   Knowledge Exploration for Video Saliency Prediction
89Vision   Applications and SystemsVariational   Capsules for Image Analysis and Synthesis
90Vision   Applications and SystemsAn   Avatar-based Intervention System for Children with Autism Spectrum Disorder
91Vision   Applications and SystemsVS-LLM:   Visual-Semantic Depression Assessment based on LLM for Drawing Projection   Test
92Vision   Applications and SystemsA   Robust and Real-Time RGB-D SLAM Method with Dynamic Point Recognition and   Depth Segmentation Optimization
93Vision   Applications and SystemsEdge-enhanced   super-resolution reconstruction of rock CT images
94Vision   Applications and SystemsMulti-Prototype   Co-Saliency Model for Plant Disease Detection
95Vision   Applications and SystemsAdvancing   Surveillance Video Clarity and Transmission: A Real-time Video   Super-Resolution Model with Background Information Awareness
96Vision   Applications and SystemsA   Novel Anti-rounding Image Steganography Method for Improved UNet++
97Vision   Applications and SystemsApple   Leaf Disease Segmentation in the Wild: A Multi-Task Collaborative Learning   Approach
98Vision   Applications and SystemsEG-Trans:   Transparent object segmentation with edge enhanced and global integrated   Transformers
99Vision   Applications and SystemsCNN-Transformer   with Stepped Distillation for Fine-Grained Visual Classification
100Vision   Applications and SystemsALMRR:   Anomaly Localization Mamba on Industrial Textured Surface with Feature   Reconstruction and Refinement
101Vision   Applications and SystemsAdaptive   Dual Attention Fusion Network for RGB-D Surface Defect Detection
102Vision   Applications and SystemsVehicle   Appearance Dataset
103Vision   Applications and SystemsRefineStyle:   Dynamic Convolution Refinement for StyleGAN
104Vision   Applications and SystemsFeature   Refinement and Calibration for Continual Visual Search
105Vision   Applications and SystemsJoint   Multi-Person Body Detection and Orientation Estimation via One Unified   Embedding
106Vision   Applications and SystemsULNet:   A Lightweight Segmentation Network For Lane Detection
107Remote   Sensing Image InterpretationShape-Aware   Soft Label Assignment and Context Enhancement for Oriented Object Detection
108Remote   Sensing Image InterpretationChareption:   Change-Aware Adaption Empowers Large Language Model for Effective Remote   Sensing Image Change Captioning
109Remote   Sensing Image InterpretationSpectral-Spatial   Multi-view Sparse Self-Representation for Hyperspectral Band Selection
110Remote   Sensing Image InterpretationAdaptive   Cross-spatial Sensing Network for Change Detection
111Remote   Sensing Image InterpretationHANet:   Hierarchical Attention Network for Remote Sensing Images Semantic   Segmentation
112Remote   Sensing Image InterpretationBiReNet:   Bilateral Network with Feature Fusion and Edge Detection for Remote Sensing   Images Road Extraction
113Remote   Sensing Image InterpretationLatent   Feature Representation-Based Low Rank Subspace Clustering for Hyperspectral   Band Selection
114Remote   Sensing Image InterpretationDICMNet:   Dynamic Irregular Resnet with Multi-direction Channel Remapping for Remote   Sensing Road Extraction
115Remote   Sensing Image InterpretationDiscriminative   Representation-based Classifier for Few-shot Remote Sensing Classification
116Remote   Sensing Image InterpretationHyperspectral   Image Change Detection via Cross-Sample Slot Attention and Dual Gated   Feed-Forward Network
117Remote   Sensing Image InterpretationSpectral   Channel-weighting CAT for Hyperspectral image Classification
118Remote   Sensing Image InterpretationBFRNet:   Bimodal Fusion and Rectification Network for Remote Sensing Semantic   Segmentation
119Remote   Sensing Image InterpretationSFFAFormer:   An Semantic Fusion and Feature Accumulation Approach for Change Detection on   Remote Sensing Images
120Remote   Sensing Image InterpretationA   Novel Multi-scale Feature Fusion based Network for Hyperspectral and   Multispectral Image Fusion
121Remote   Sensing Image InterpretationA   Sidelobe-Aware Semi-Deformable Convolutional Ship Detection Network for   Synthetic Aperture Radar Imagery
122Vision   Problems in Robotics, Autonomous DrivingEdge   Assisted Fast Optical Flow Matching SLAM in Underground Rescue Environments
123Vision   Problems in Robotics, Autonomous DrivingViPro-BEV:   Few-Shot Visual Prompting for Bird’s-Eye-View Perception
124Vision   Problems in Robotics, Autonomous DrivingDNIV-SLAM:   Neural Implicit Visual SLAM in Dynamic Environments
125Vision   Problems in Robotics, Autonomous DrivingBEVDot:   Enhancing Environmental Perception for Autonomous Driving with a Deformable   Depth Mechanism
126Vision   Problems in Robotics, Autonomous DrivingUDA-KB:   Unsupervised domain adaptation RGB-Thermal semantic segmentation via   knowledge bridge
128Vision   Problems in Robotics, Autonomous DrivingVTMF2N:   Towards Accurate Visual-tactile Slip Detection via Multi-modal Feature Fusion   in Robotic Grasping
129Vision   Problems in Robotics, Autonomous DrivingHHATP:   A Lightweight Heterogeneous Hierarchical Attention Model for Trajectory   Prediction
130Vision   Problems in Robotics, Autonomous DrivingImproved   End-to-End Multilevel NeRF-Based Dense RGB-D SLAM
131Vision   Problems in Robotics, Autonomous DrivingKA-Seg:   Improving LiDAR Point Cloud Segmentation through Key Point Sampling and   Attention-based Querying
132Vision   Problems in Robotics, Autonomous DrivingASPVNet:   Attention Based Sparse Point-Voxel Network for 3D Object Detection
133Vision   Problems in Robotics, Autonomous DrivingTask-Oriented   Scanpath Prediction with Spatial-Temporal Information in Driving Scenarios
134Vision   Problems in Robotics, Autonomous DrivingIntelligent   Navigation System that Gives Trajectory Guidance in 3D Scenes
135Vision   Problems in Robotics, Autonomous DrivingDynamic   Attention-Enhanced Spatio-Temporal Network For Pedestrian Collision Risk   Assessment
136Vision   Problems in Robotics, Autonomous DrivingDynamic   Object Suppression in Visual Odometry via Adaptive Masked Flow Refinement

1Neural   Network and Deep LearningAuto-USOD:   Searching Topology for Underwater Salient Object Detection
2Neural   Network and Deep LearningMBA-NER:   Multi-Granularity Entity Boundary-Aware Contrastive Enhanced for Two-stage   Few-Shot Named Entity Recognition
3Neural   Network and Deep LearningEnhancing   Zero-Shot Anomaly Detection: CLIP-SAM Collaboration with Cascaded Prompts
4Neural   Network and Deep LearningTowards   Adversarial-Robust Class-Incremental Learning via Progressively Volume-up   Perturbation Generation
5Neural   Network and Deep LearningNeighborhood   Difference-Enhanced Graph Neural Network based on Hypergraph for Social Bot   Detection
6Neural   Network and Deep LearningSRMAE:   Masked Image Modeling for Scale-Invariant Deep Representations
7Neural   Network and Deep LearningAn   Entropy-based Pseudo-Label Mixup Method for Source-Free Domain Adaptation
8Neural   Network and Deep LearningDAMS:   Document Image Steganography with Dual Attention Multi-Scale Encoder-Decoder   Architecture
9Neural   Network and Deep LearningDual-Task   Cascaded for Proactive Deepfake Detection Using QPCET watermarking
10Neural   Network and Deep LearningXrGroup:   Graph Convolutional Networks for Group-Aware Pedestrian Trajectory Prediction   with Speed information
11Neural   Network and Deep LearningInvisible   Backdoor Attack Through Singular Value Decomposition
12Neural   Network and Deep LearningSelf-supervised   transformer-based pre-training method with General Plant Infection dataset
13Neural   Network and Deep LearningSpatio-Temporal   Perceiving Network Based Vision Transformer for 6-Hour Precipitation   Prediction Using Multi-Meteorological Factors
14Neural   Network and Deep LearningLearning   Local Spatial and Global Context Activation for Visual Recognition
15Neural   Network and Deep LearningCRFNet:   A medical image segmentation method using the cross attention mechanism and   refined feature fusion strategy
16Neural   Network and Deep LearningSCC-CAM:   Weakly Supervised Segmentation on Brain Tumor MRI with Similarity Constraint   and Causality
17Neural   Network and Deep LearningGlobal   Structural Consistency Set Transformer
18Neural   Network and Deep LearningIMO-Net:   Integrated Memory Optimization Network for Video Instance Lane Detection
19Neural   Network and Deep LearningLightweight   Facial Expression Recognition Based on Hybrid Multiscale and Multi-Head   Collaborative Attention
20Neural   Network and Deep LearningSingle   model learns multiple styles of Chinese calligraphy via Style Collection   Mechanism
21Neural   Network and Deep LearningFusionNet   for Interactive Image Segmentation
22Neural   Network and Deep LearningDynamic   Spatial-Temporal Perception Graph Convolutional Networks for Traffic Flow   Forecasting
23Neural   Network and Deep LearningForeign   object classification for coal conveyor belts based on deep learning
24Neural   Network and Deep LearningInterpretable   Unsupervised Homography Estimation
25Neural   Network and Deep LearningDRC-NET:   Density Reweighted Convolution Network for Edge Curve Extraction
26Neural   Network and Deep LearningUnsupervised   Underwater Image Enhancement Combining Imaging Restoration and Prompt   Learning
27Neural   Network and Deep LearningGenerative   Adversarial Imitation Learning Algorithm based on Improved Curiosity Module
28Neural   Network and Deep LearningZero-shot   Blind Face Restoration via Conditional Diffusion Sampling
29Neural   Network and Deep LearningTask-aware   Few-shot Image Generation via Dynamic Local Distribution Estimation and   Sampling
30Neural   Network and Deep LearningAdversarial   Training and Contrastive Learning with Bidirectional Transformers for   Sequence Recommendation
31Neural   Network and Deep LearningEmpathizing   Before Generation: A Double-layered Framework for Emotional Support LLM
32Neural   Network and Deep LearningST-RetNet:   A Long-term Spatial-Temporal Traffic Flow Prediction Method
33Neural   Network and Deep LearningDARTS-CGW:   Research on Differentiable Neural Architecture Search Algorithm Based on   Coarse Gradient Weighting
34Neural   Network and Deep LearningPanoDthNet:   Depth estimation based on indoor and outdoor panoramic images
35Neural   Network and Deep LearningA   Supervised Domain Adaptation Method with Alignment Regularization for   Low-light Facial Expression Recognition
36Neural   Network and Deep LearningDiffuSaliency:   Synthesizing Multi-Object Images with Masks for Semantic Segmentation Using   Diffusion and Saliency Detection
37Neural   Network and Deep LearningEFOA:   Enhancing Out-of-Distribution Detection by Fake Outlier Augmentation
38Neural   Network and Deep LearningA   Stereo Matching Method for Specular Objects via Cascaded Network and Joint   Supervision
39Neural   Network and Deep LearningAn   Asymmetric Game Theoretic Learning Model
40Neural   Network and Deep LearningLearning   360° Optical Flow using Tangent Images and Transformer
41Neural   Network and Deep LearningODAdapter:   An effective method of Semi-Supervised Object Detection for Aerial Images
42Neural   Network and Deep LearningFrequency-domain   Transformation-based Dynamic Gesture Recognition with skeleton
43Neural   Network and Deep LearningMRGN:   Multiscale Relation-gated Graph Network for Entity Alignment
44Neural   Network and Deep LearningAdaptive   Selective Knowledge Distillation: not blindly accepting teachers as Oracles
45Neural   Network and Deep LearningPeriodic   Iterative Segmentation-Colorization Training: Line Drawing Colorization Using   Text Tag with CBAMCat
46Neural   Network and Deep LearningHistogram   Prediction and Equalization for Indoor Monocular Depth Estimation
47Neural   Network and Deep LearningSheepNet:   Rapid Sheep Face Recognition Based on Attention and Knowledge Distillation
48Neural   Network and Deep LearningLPMANet: A   Lightweight Partial Multilayer Aggregation Network for Tiny Drone Detection
49Neural   Network and Deep LearningHiTraj:   Heterogeneous Interaction Learning with Transformers for Trajectory   Prediction
50Neural   Network and Deep LearningAdaptive   Knowledge Matching for Exemplar-Free Class-Incremental Learning
51Neural   Network and Deep LearningFocusing   on Significant Guidance: Preliminary Knowledge Guided Distillation
52Neural   Network and Deep LearningESTOR: Enumerate-Specify-Tutor   Mechanism Used of Lexicon in Chinese NER
53Neural   Network and Deep LearningEBSD:   Short Text Sentiment Classification Using Sentence Vector Enhancement   Mechanism
54Neural   Network and Deep LearningCEDP-YOLO:   UAV Object Detection Based on Context Enhancement and Dynamic Perception
55Neural   Network and Deep LearningTLLFusion:   An End-to-End Transformer-Based Method for Low-Light Infrared and Visible   Image Fusion
56Neural   Network and Deep LearningBD-YOLO: High-precision lightweight concrete bubble detector based on YOLOv7
57Neural   Network and Deep LearningSemantic   Consistency-Enhanced Refined Hashing for Fine-Grained Image Retrieval
58 | Neural Network and Deep Learning | Frequency Feature Enhanced Mix Calibration Attention Network for Sequential Recommendation
59 | Neural Network and Deep Learning | CFMISA: Cross-modal Fusion of Modal Invariant and Specific Representations for Multimodal Sentiment Analysis
60 | Neural Network and Deep Learning | A Privacy-Preserving Source Code Vulnerability Detection Method
61 | Neural Network and Deep Learning | Physically Informed Prior and Cross-Correlation Constraint for Fine-grained Road Crack Segmentation
62 | Neural Network and Deep Learning | AFSNet: Adaptive Feature Suppression Network for Remote Sensing Image Change Detection
63 | Neural Network and Deep Learning | BIVL-Net: Bidirectional Vision-Language Guidance for Visual Question Answering
64 | Neural Network and Deep Learning | Enhancing Task Identification through Pseudo-OOD Features for Class-Incremental Learning
65 | Neural Network and Deep Learning | Contextual Feature-Based Medical Visual Question Answering Aided by Learnable Matrix
66 | Neural Network and Deep Learning | ImgQuant: Towards adversarial defense with robust boundary via dual-image quantization
67 | Neural Network and Deep Learning | Swelling-ViT: Rethink Data-efficient Vision Transformer from Locality
68 | Neural Network and Deep Learning | Target-Specific Domain Adaptation via Geometry-Correlation Prediction for Point Cloud
69 | Neural Network and Deep Learning | Dual-stream Network of Vision Mamba and CNN with Auto-scaling for Remote Sensing Image Segmentation
70 | Neural Network and Deep Learning | A Novel Combined GAN for Defects Generation using Masking Mechanisms
71 | Neural Network and Deep Learning | Semi-supervised lightweight fabric defect detection
72 | Neural Network and Deep Learning | Semi-adaptive Synergetic Two-way Pseudoinverse Learning System
73 | Neural Network and Deep Learning | Invariant Risk Minimization Augmentation for Graph Contrastive Learning
74 | Neural Network and Deep Learning | Enhancing Fast Adversarial Training with Learnable Adversarial Perturbations
75 | Neural Network and Deep Learning | DTAFORMER: Directional Time Attention Transformer For Long-Term Series Forecasting
76 | Neural Network and Deep Learning | Unpaired Multi-scenario Sketch Synthesis via Texture Enhancement
77 | Neural Network and Deep Learning | ISO-VTON: Fine-Grained Style-Local Flows with Dual Cross-Attention for Immersive Outfitting
78 | Neural Network and Deep Learning | Near-surface Air Temperature Inversion Study Based on U-Net Family with Multi-source Data
79 | Neural Network and Deep Learning | Relation Detection with Transformers for Panoptic Scene Graph Generation
80 | Neural Network and Deep Learning | WEDNet: A Wavelet Enhanced Detail Network for Low-Light Image Enhancement
81 | Neural Network and Deep Learning | Textureness-Aware Neural Network for Edge Detection
82 | Neural Network and Deep Learning | Enhancing the Transferability and Stealth of Deepfake Detection Attacks Through Latent Diffusion Models
83 | Neural Network and Deep Learning | Backdoor Richer Watermarks using Dynamic Mask Covering for Dual Identity Verification
84 | Neural Network and Deep Learning | Pedestrian Trajectory Prediction using Spatio-Temporal VAE
85 | Neural Network and Deep Learning | Real-Time DEtection TRansformer with Bi-Level Routing Attention
86 | Neural Network and Deep Learning | NFP-UNet: Deep Learning Estimation of Placeable Areas for 2D Irregular Packing
87 | Neural Network and Deep Learning | Advancements in Photorealistic Style Translation with a Hybrid Generative Adversarial Network
88 | Neural Network and Deep Learning | Transformer Image Quality Assessment Based on Multi-Directional Feature Extraction
89 | Neural Network and Deep Learning | MRGAN: LightWeight Monaural Speech Enhancement using GAN Network
90 | Neural Network and Deep Learning | Data augmentation guided Decouple Knowledge Distillation for low-resolution fine-grained image classification
91 | Neural Network and Deep Learning | Virtual Student Distribution Knowledge Distillation for Long-tailed Recognition
92 | Neural Network and Deep Learning | Open-Vocabulary Instance Segmentation-Boundary IS-Goal
93 | Neural Network and Deep Learning | 3DLaneFormer: End-to-End 3D Lane Detection with Voxel Descriptors
94 | Neural Network and Deep Learning | More Like Real World Game Challenge for Partially Observable Multi-Agent Cooperation
95 | Neural Network and Deep Learning | Centroid-centered Modeling for Efficient Vision Transformer Pre-training
96 | Neural Network and Deep Learning | Spectral–Spatial Blockwise Masked Transformer With Contrastive Multi-View Learning for Hyperspectral Image Classification
97 | Neural Network and Deep Learning | Local reactivation for communication efficient federated learning based on sparse gradient deviation
98 | Machine Learning | Cluster center initialization for fuzzy K-modes clustering using outlier detection technique
99 | Machine Learning | Generalizing soft actor-critic algorithms to discrete action spaces
100 | Machine Learning | LarvSeg: Exploring Image Classification Data For Large Vocabulary Semantic Segmentation via Category-wise Attentive Classifier
101 | Machine Learning | PhaseNN: An Unsupervised and Spatial-Frequency Integrated Network for Phase Retrieval
102 | Machine Learning | Sequential Transfer of Pose and Texture for Pose Guided Person Image Generation
103 | Machine Learning | Balanced Clustering with Discretely Weighted Pseudo-Label
104 | Machine Learning | Tensor Robust Principal Component Analysis with Hankel Structure
105 | Machine Learning | Self-Distillation via Intra-class Compactness
106 | Machine Learning | An Enhanced Dual-Channel-Omni-Scale 1DCNN for Fault Diagnosis
107 | Machine Learning | Visual-Guided Reasoning Path Generation for Visual Question Answering
108 | Machine Learning | FedGC: Federated Learning on Non-IID Data via Learning from Good Clients
109 | Machine Learning | Inter-class Correlation-based Online Knowledge Distillation
110 | Machine Learning | Accelerating Domain Adaptation with Cascaded Adaptive Vision Transformer
111 | Machine Learning | Multistage Compression Optimization Strategies for Accelerating Diffusion Models
112 | Machine Learning | Defending Adversarial Patches via Joint Region Localizing and Inpainting
113 | Machine Learning | Multi-view Spectral Clustering Based on Topological Manifold Learning
114 | Machine Learning | Client selection mechanism for federated learning based on class imbalance
115 | Machine Learning | A New Paradigm for Enhancing Ensemble Learning through Parameter Diversification
116 | Machine Learning | Adaptive Multi-Information Feature Fusion MLP with Filter Enhancement for Sequential Recommendation
117 | Machine Learning | FedDCP: Personalized Federated Learning Based on Dual Classifiers and Prototypes
118 | Machine Learning | AtomTool: Empowering Large Language Models with Tool Utilization Skills
119 | Machine Learning | Making the Primary Task Primary: Boosting Few-Shot Classification by Gradient-biased Multi-task Learning
120 | Machine Learning | Cascade Large Language Model via In-Context Learning for Depression Detection on Chinese Social Media
121 | Machine Learning | TRAE: Reversible Adversarial Example with Traceability
122 | Machine Learning | A Two-stage Active Domain Adaptation Framework for Vehicle Re-Identification
123 | Machine Learning | FBR-FL: Fair and Byzantine-Robust Federated Learning via SPD Manifold
124 | Machine Learning | SecBFL-IoV: A Secure Blockchain-Enabled Federated Learning Framework for Resilience against Poisoning Attacks in Internet of Vehicles
125 | Optimization and Learning Methods | Continuous Multi-Agent Path Finding for Drone Delivery
126 | Optimization and Learning Methods | Enhancing Multi-modal Contrastive Learning via Optimal Transport-based Consistent Modality Alignment
127 | Optimization and Learning Methods | Instance-level Scaling and Dynamic Margin-alignment Knowledge Distillation
128 | Computational Photography, Sensing and Display Technology | Light Field Stitching via Mesh Deformation Alignment and Low-Rank-Based Fusion
129 | Computational Photography, Sensing and Display Technology | CPE COIN++: Towards Optimized Implicit Neural Representation Compression via Chebyshev Positional Encoding
130 | Computational Photography, Sensing and Display Technology | LF-SAET: Cascaded Spatial-Angular-EPI Transformers for Light Field Image Super-Resolution
131 | Computational Photography, Sensing and Display Technology | A New Data-Driven Paradigm for SAR Jamming Suppression
132 | Computational Photography, Sensing and Display Technology | Adaptive Decoupled Prompting for Class Incremental Learning
