Driving Hazard Prediction by Multi-modal AI

In recent years, with the advancement of autonomous driving technologies and advanced driver-assistance systems (ADAS), predicting hazards in the vicinity of vehicles has become a critical issue for safe driving. Conventional methods have relied on video

Zero-shot Texture Anomaly Detection

In recent years, anomaly detection (AD) in images has become increasingly important in industrial inspection and quality control. In particular, detecting anomalies in texture images has encountered a challenge: conventional methods assume the availability of numerous

Bridge Inspection by Multi-modal AI

This paper focuses on enhancing visual question answering (VQA) for bridge inspection using multimodal AI techniques that process both images and natural language. Traditionally, bridge inspections rely on expert visual assessments, which are time-consuming, costly, and

SBCFormer: An Image Recognition Model for Single Board Computers

In recent years, deep learning-based image recognition has expanded into practical applications such as agriculture, fisheries, and livestock management. In these domains, low-cost and low-power systems are often more important than high-speed processing, making single board

GRIT: Transformer-based Image Captioning Leveraging Grid and Region Features

“Image captioning,” the task of describing the scenery and objects in an image using natural language, is one of the technologies in artificial intelligence that enables visual information to be expressed in words. In recent mainstream

Image Anomaly Detection via Local and Global Knowledge Integration

This study proposes a novel method for high-precision detection of “logical anomalies” (e.g., misplacements or omissions of parts that depend on the overall contextual information of an image) in applications such as industrial inspection. Conventional anomaly

Unsupervised Domain Adaptation for Semantic Segmentation

This paper proposes a novel method called “Cross-Region Adaptation (CRA)” aimed at improving the accuracy of unsupervised domain adaptation (UDA) for semantic segmentation. Semantic segmentation, which assigns semantic labels to each pixel in an image, is

High-Speed, High-Precision Visual Localization via Integrated Local Feature Aggregation

Visual localization is a critical task in many computer vision applications such as Structure-from-Motion (SfM) and SLAM, as it involves estimating the 6-DoF camera pose. Traditional approaches extract global features for image retrieval and local features

Symmetry-Aware Architecture for Enhanced Generalization in Embodied Visual Navigation

Embodied visual navigation, which is crucial in fields such as autonomous robotics and augmented reality, enables a robot to navigate and search for target objects in an unknown environment while localizing itself. However, existing deep reinforcement

A Graph Network Approach to Fast Bundle Adjustment for Optimized SLAM

In the fields of Structure-from-Motion (SfM) and visual SLAM (Simultaneous Localization and Mapping), Bundle Adjustment (BA) is a crucial process that optimizes camera poses and the positions of 3D landmarks. In practice, many visual SLAM systems
