Encoder-Decoder Transformer Architecture

Context-Aware Pedestrian Trajectory Prediction with Multimodal Transformer

Abstract: We propose a novel solution for predicting future trajectories of pedestrians. Our method uses a multimodal encoder-decoder transformer architecture, which takes as input both pedestrian ...

IEEE

ColonFormer: An Efficient Transformer Based Method for Colon Polyp Segmentation

Abstract: Identifying polyps is challenging for automatic analysis of endoscopic images in computer-aided clinical support systems. Models based on convolutional networks (CNN), transformers, and ...

GitHub

LEFormer: A Hybrid CNN-Transformer Architecture for Accurate Lake Extraction from Remote Sensing Imagery

The repository contains official PyTorch implementations of training and evaluation codes and pre-trained models for our ICASSP 2024 paper LEFormer. Figure 1: Overview architecture of LEFormer, ...

InfoQ

Gemma 4 12B Enables On-Device, Multimodal Agentic Workflows with an Encoder-free Architecture

GitHub

Trax — Deep Learning with Clear Code and Speed

We welcome contributions to Trax! We welcome PRs with code for new models and layers as well as improvements to our code and documentation. We especially love notebooks that explain how models work ...

Frontiers

DPCrossU-Net: a dual-branch parallel CNN–Transformer network for lung nodule segmentation

We propose DPCrossU-Net, a dual-branch parallel encoder–decoder network that integrates convolutional and Vision Transformer representations. The encoder employs parallel CNN and ViT branches with a ...
