1. Theme

Doctoral Forum

2. Purpose and Significance

This forum gives PhD students a platform for international exchange, learning, and cooperation, and an opportunity to communicate with experienced researchers in computer vision and pattern recognition. Winners and official candidates of the 2020 CCF-CV Academic Emerging Award will be invited directly to participate in the doctoral forum.

3. Schedule

Time: 15:20-17:20, October 18, 2020
Location: Multifunctional North Hall
Time | Title | Speaker | Host
15:20-15:50 | Crafting Efficient Neural Networks through the Lens of Spatial Resolution | Duo Li | Tianzhu Zhang
15:50-16:20 | Efficient network structure design for rich-scale scenes | Shanghua Gao |
16:20-16:50 | Object instance segmentation and pose detection in complex environments | Sida Peng | Guangcan Liu
16:50-17:20 | Efficient Representation Learning for Dense Object Detection | Xiang Li |

4. Proposed lecture content and introduction of invited speakers

Invited Report 1: Crafting Efficient Neural Networks through the Lens of Spatial Resolution
Summary: Although deep neural networks (DNNs) demonstrate dominant power in many computer vision tasks, they come at the price of high computational complexity. Neural network engineering for better efficiency has therefore become increasingly appealing, since it allows flexible and fast deployment in real-world applications. In previous work on efficient network design, adjusting the architecture along the network width, that is, the channel dimension, has attracted the most attention. In this talk, I will cover some of our recent works that explore efficient neural network design along the complementary spatial dimension, including tactfully shrinking intermediate feature resolution and dynamically switching input image resolution. Compared to prior art, our approaches achieve better performance at a lower computational budget or with greater efficiency.
  • Speaker bio: Duo Li is a second-year MPhil student in the Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, under the supervision of Prof. Qifeng Chen. He received his B.Eng. degree from the Department of Automation, Tsinghua University, in 2019. His research interests lie broadly in computer vision and machine learning, especially neural network understanding and unsupervised learning. He published 6 papers in top-tier computer vision conferences, including ICCV, CVPR and ECCV, during his first year of graduate study. He is the first author of 5 of these 6 papers and the only student author of 4 of them. He has spent time at Intel Labs China, SenseTime Research, and ByteDance AI Lab. His homepage is http://www.cs.ust.hk/~dlibh.

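Invited Report 2: Efficient network structure design for rich-scale scenes
Summary: Visual scenes contain abundant multi-scale information. Representing the multi-scale distribution of objects in a scene is key to improving performance on vision tasks. SIFT, the most representative feature extractor, can handle multi-scale information, but it is limited by the small set of patterns that hand-crafted design can provide. The evolution of deep neural network architectures has also implicitly improved multi-scale representation ability. From the perspective of multi-scale feature acquisition, this talk reviews the evolution of several important deep neural network architectures and leads into how to design more efficient multi-scale network structures. The talk describes how to improve a network's multi-scale feature representation ability at a more fine-grained level, and how to build multi-scale network structures that adaptively remove redundancy. It also covers applications of efficient multi-scale representation networks in a variety of vision tasks.
  • Speaker bio: Shanghua Gao is a first-year PhD student at the Media Computing Lab, Nankai University, supervised by Prof. Ming-Ming Cheng. His main research interests are neural network architecture design, representation learning, and computer vision applications such as scene segmentation. He has published 8 papers in these areas in journals and conferences including TPAMI, ICCV, and CVPR, with more than 370 Google Scholar citations. His research has been widely applied in scenarios including the fight against the COVID-19 epidemic. Homepage: http://shgao.site/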

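Invited Report 3: Object instance segmentation and pose detection in complex environments
Summary: Object detection, segmentation, and 3D pose acquisition are core capabilities for an intelligent agent to perceive its surroundings, and they are widely used in tasks such as augmented reality and robotic grasping. In real scenes, target objects are often in complex environments, for example occluded by other objects, which severely interferes with recognition. This talk introduces some of our work on this problem. It first presents PVNet, a pixel-wise voting network that densely predicts an intermediate representation of the object's 3D pose and thereby greatly improves the stability of pose estimation under occlusion. The second part discusses DeepSnake, a new approach to object instance segmentation that achieves real-time instance segmentation through contour deformation based on circular convolution.
  • Speaker bio: Sida Peng is a third-year PhD student at the State Key Laboratory of CAD&CG, Zhejiang University, advised by Research Professor Xiaowei Zhou. His research area is computer vision and augmented reality, with a focus on reconstruction and object pose detection. During his PhD he has published 2 CVPR papers and 1 NeurIPS paper; both CVPR papers are first-author papers and were accepted as Oral Presentations. All of his published papers are open source, with more than 1,600 GitHub stars in total and more than 110 Google Scholar citations. PVNet, published at CVPR 2019, has been cited 98 times, held the top spot on a public dataset leaderboard for nearly a year, and won first place in the China Graduate Artificial Intelligence Innovation Competition.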

Invited Report 4: Efficient Representation Learning for Dense Object Detection
Summary: Learning efficient representation is crucial and fundamental for dense object detection. However, recent dense detectors face many problems in representation design, including (1) inconsistent usage of localization quality estimation and classification score between training and inference, (2) inflexible representation of bounding boxes, and (3) implicit and trivial feature representation for localization quality estimation. To address these problems, we propose to design new representations for dense object detectors, e.g., classification-IoU joint representation, distributed bounding box representation, and distribution-guided quality predictor, respectively. Furthermore, a novel Generalized Focal Loss is applied to learn these new representations successfully and efficiently, which achieves state-of-the-art performance among dense object detectors.
  • Speaker bio: Xiang Li is starting his post-doctoral career at Nanjing University of Science and Technology as a candidate for the 2020 Postdoctoral Innovative Talent Program. He received his B.Eng. and PhD degrees from Nanjing University of Science and Technology, supervised by Prof. Jian Yang and co-supervised by Prof. Xiaolin Hu. He was a research intern at Microsoft Research Asia, supervised by Tao Qin and Tie-Yan Liu, and is a visiting scholar at Momenta. His team won Alibaba Tianchi’s first big data competition, the Ali mobile recommendation algorithm contest (1st of 7,186 teams), and Didi Tech’s first big data competition, the travel demand prediction algorithm contest (1st of 7,664 teams). He has published 10 papers at CVPR, NeurIPS, AAAI, IJCAI and in T-ITS as first or co-first author, with 610+ Google Scholar citations. His representative works include Selective Kernel Networks (SKNets), Generalized Focal Loss (GFL) and the “Understanding the Disharmony” series.

5. Organizers

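  • Tianzhu Zhang (tzzhang@ustc.edu.cn) is a professor and doctoral supervisor at the University of Science and Technology of China. He received his bachelor's degree from Beijing Institute of Technology in 2006 and his PhD from the Institute of Automation, Chinese Academy of Sciences (CAS), in 2011. After his PhD, he conducted research successively at the University of Illinois at Urbana-Champaign's advanced research institute in Singapore, King Abdullah University of Science and Technology, and the Institute of Automation, CAS. His main research interests are computer vision, pattern recognition, and multimedia analysis. He has published more than 90 papers in domestic and international journals and conferences, including more than 60 papers in ACM/IEEE Transactions and China Computer Federation (CCF) Class A conferences such as IEEE T-PAMI, IEEE T-IP, IJCV, CVPR, and ICCV. His papers have received more than 6,200 Google Scholar citations, and 6 have been selected as ESI Highly Cited Papers. He received the China MM 2017 Best Paper Award and the Best Paper Award at ACM MM 2016, a CCF Class A conference. His honors include the CAS President's Excellence Award, the CAS Outstanding Doctoral Dissertation Award, membership in the CAS Youth Innovation Promotion Association, and the First Prize in Natural Science of the Chinese Institute of Electronics. He serves on the editorial boards of IEEE TCSVT and Neurocomputing and as a guest editor of CVIU. He has served as area chair for CVPR 2020, ICCV 2019, ACM MM 2020/2019, ECCV 2020, ICPR 2018, WACV 2018, and MVA 2017, and as publication chair for ICIMCS 2015, among other roles.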

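  • Guangcan Liu (gcliu@nuist.edu.cn), male, was born in 1982 in Shaoyang, Hunan Province. He received his B.S. degree from the Department of Mathematics, Shanghai Jiao Tong University, in 2004, and his PhD in engineering from the Department of Computer Science and Technology, Shanghai Jiao Tong University, in 2010 (advisors: Yong Yu, Zhouchen Lin, and Xiaoou Tang). From 2010 to 2014 he carried out postdoctoral research successively at the National University of Singapore, the University of Illinois at Urbana-Champaign, and Cornell University. He returned to China in 2014 and joined the School of Automation, Nanjing University of Information Science and Technology, as a professor and doctoral supervisor. His main research areas are machine learning and computer vision, and in recent years he has produced a series of results on low-rank learning theory and methods. In 2016 he received the National Natural Science Foundation of China Excellent Young Scientists Fund and the Jiangsu Province Distinguished Young Scholars award. In 2017 he received the Second Prize in Natural Science of the Ministry of Education and the Wu Wenjun AI Outstanding Youth Award, and was named to the ESI Highly Cited Researchers list. In 2018 he received the First Prize in Natural Science for Jiangsu universities. He is an IEEE Senior Member, a Senior PC Member of the CCF Class A conferences AAAI and IJCAI, and a standing committee member of several academic bodies, including the Machine Vision Technical Committee of the China Society of Image and Graphics and the Pattern Recognition Technical Committee of the Jiangsu Association for Artificial Intelligence.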