Academia Sinica, Institute of Information Science

Research

Recent Research Results

Test-Time Stain Adaptation with Diffusion Models for Histopathology Image Classification

European Conference on Computer Vision (ECCV), September 2024

Cheng-Chang Tsai, Yuan-Chih Chen, and Chun-Shien Lu

Abstract

Stain shifts are prevalent in histopathology images and are typically dealt with by normalization or augmentation. Since training-time methods are limited in handling unseen stains, we propose a test-time stain adaptation method (TT-SaD) with diffusion models that achieves stain adaptation by solving a nonlinear inverse problem during testing. TT-SaD is promising in that it needs only a single domain for training yet adapts well to other domains during testing, sparing models from retraining whenever new data become available. For tumor classification, stain adaptation by TT-SaD outperforms state-of-the-art diffusion model-based test-time methods. Moreover, TT-SaD beats training-time methods when testing on data that are inaccessible during training. To our knowledge, stain adaptation with diffusion models at test time remains largely unexplored.

Transcriptomics and gut microbiome analysis of the edible herb Bidens pilosa as a functional feed additive to promote growth and metabolism in tilapia (Oreochromis spp.)

BMC Genomics, August 2024

Che-Chun Chen, Chung-Yen Lin, Hsin-Yun Lu, Chyng-Hwa Liou, Ying-Ning Ho, Chang-Wen Huang, Zhong-Fu Zhang, Chih-Hsin Kao, Wen-Chin Yang, Hong-Yi Gong

Abstract

Background

To reduce the use of antibiotics and chemicals in aquaculture, the edible herb B. pilosa has been selected as a multifunctional feed additive. Although there has been considerable research into the effects of B. pilosa on poultry, its wider effects, particularly on growth and the gut microbiota in fish, remain largely unexplored. We aim to investigate the interactive effects of B. pilosa on host growth and the gut microbiota using transcriptomics and gut microbiome analysis in B. pilosa-fed tilapia.

Results

In this study, we added 0.5% and 1% B. pilosa to the diet and observed that the growth performance of tilapia was significantly increased after 8 weeks of feeding. Comparative transcriptome analysis was performed on RNA sequence profiles obtained from liver and muscle tissues. Functional enrichment analysis showed that B. pilosa regulates several pathways and genes, including amino acid metabolism, lipid metabolism, carbohydrate metabolism, the endocrine system, signal transduction, and metabolism of other amino acids. The expression of selected growth-associated genes was validated by qRT-PCR. The qRT-PCR results indicated that B. pilosa may enhance growth performance by activating the expression of the liver igf1 and muscle igf1rb genes and inhibiting the expression of the muscle negative regulator myostatin b (mstnb). Enhancement of endocrine Igf1/Igf1rb signaling and suppression of Mstn signaling both induced the expression of the myogenic regulatory factors (MRFs) myod1, myogenin, and mrf4 to promote muscle growth in tilapia. The predicted function of the gut microbiota showed several significantly different pathways that overlapped with the KEGG enrichment results of differentially expressed genes in the liver transcriptomes. This suggests that the gut microbiota may influence liver metabolism through the gut-liver axis in B. pilosa-fed tilapia.

Conclusions

In conclusion, dietary B. pilosa can regulate endocrine igf1 signaling and myostatin signaling to activate the expression of MRFs and promote muscle growth, and it can alter the composition of gut bacteria, which in turn affects the host's liver amino acid metabolism, carbohydrate metabolism, endocrine system, lipid metabolism, metabolism of other amino acids, and signal transduction, ultimately enhancing growth performance. Our results suggest that B. pilosa has the potential to be a functional feed additive and an alternative to antibiotics as a growth promoter in aquaculture.

Learnable Layer Selection and Model Fusion for Speech Self-Supervised Learning Models

Interspeech 2024, September 2024

Sheng-Chieh Chiu, Chia-Hua Wu, Jih-Kang Hsieh, Yu Tsao, and Hsin-Min Wang

Abstract

In this paper, we investigate methods for fusing feature representations derived from multiple speech self-supervised learning (SSL) models, along with techniques to determine the optimal layer within each model. We evaluate five fusion strategies, finding that temporal interleaved concatenation is the most robust and effective for the SUPERB ASR task. Additionally, we demonstrate that Gumbel layer selection can automatically select the most appropriate SSL layer, with better performance than the commonly used weighted-sum method. Furthermore, dimension-wise Gumbel layer selection shows promise in adaptively combining the layers of a single SSL model. Finally, we show that joint SSL model fusion and dimension-wise Gumbel layer selection further enhances effectiveness.
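
The Gumbel layer selection mentioned above can be illustrated with a small sketch. The code below is a generic Gumbel-softmax selection over per-layer features, not the paper's actual implementation; the function name, shapes, and temperature are illustrative assumptions.

```python
import numpy as np

def gumbel_layer_select(layer_feats, logits, tau=1.0, rng=None):
    """Pick one SSL layer's features via Gumbel-softmax (illustrative sketch).

    layer_feats: (num_layers, dim) -- one feature vector per layer.
    logits:      learnable per-layer selection scores, shape (num_layers,).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise turns the argmax into a sample from softmax(logits).
    u = rng.uniform(1e-9, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    y = (logits + g) / tau
    w = np.exp(y - y.max())
    w = w / w.sum()                      # soft selection weights
    hard = np.zeros_like(w)
    hard[np.argmax(w)] = 1.0             # straight-through hard pick
    return hard @ layer_feats, w
```

In training one would use the soft weights `w` on the backward pass and the hard one-hot pick on the forward pass (the straight-through trick); the dimension-wise variant would draw an independent selection per feature dimension.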

Learning Diffusion Models for Multi-View Anomaly Detection

18th European Conference on Computer Vision (ECCV), September 2024

Chieh Liu, Yu-Min Chu, Ting-I Hsieh, Hwann-Tzong Chen and Tyng-Luh Liu

Abstract

We are exploring an emerging formulation in anomaly detection (AD) where multiple instances of the same object are produced simultaneously and distinctly to address the limitation that using only a single instance may not effectively capture any underlying defects. More specifically, we concentrate on a specific scenario where each object of interest is linked to seven distinct data views/representations. The first six views involve capturing images with a stationary camera under six different lighting conditions, while the seventh view pertains to the 3D normal information. We refer to our intended task as multi-view anomaly detection. To tackle this problem, our approach involves training a view-invariant ControlNet that can produce consistent feature maps regardless of the data views. This training strategy enables us to mitigate the impact of varying lighting conditions and to fuse information from both the RGB color appearance and the 3D normal geometry effectively. Moreover, as the diffusion process is not deterministic, we utilize the DDIM scheme to improve the applicability of our established memory banks of diffusion-based features for anomaly detection inference. To demonstrate the efficacy of our approach, we present extensive ablation studies and state-of-the-art experimental results on the Eyecandies dataset.

Non-Intrusive Speech Intelligibility Prediction for Hearing Aids using Whisper and Metadata

Interspeech 2024, September 2024

Ryandhimas E. Zezario, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, and Yu Tsao

Abstract

Automated speech intelligibility assessment is pivotal for hearing aid (HA) development. In this paper, we present three novel methods to improve intelligibility prediction accuracy and introduce MBI-Net+, an enhanced version of MBI-Net, the top-performing system in the 1st Clarity Prediction Challenge. MBI-Net+ leverages Whisper's embeddings to create cross-domain acoustic features and includes metadata from speech signals by using a classifier that distinguishes different enhancement methods. Furthermore, MBI-Net+ integrates the hearing-aid speech perception index (HASPI) as a supplementary metric into the objective function to further boost prediction performance. Experimental results demonstrate that MBI-Net+ surpasses several intrusive baseline systems and MBI-Net on the Clarity Prediction Challenge 2023 dataset, validating the effectiveness of incorporating Whisper embeddings, speech metadata, and related complementary metrics to improve prediction performance for HA.

Pseudo-Embedding for Generalized Few-Shot 3D Segmentation

18th European Conference on Computer Vision (ECCV), September 2024

Chih-Jung Tsai, Hwann-Tzong Chen and Tyng-Luh Liu

Abstract

Existing generalized few-shot 3D segmentation (GFS3DS) methods typically prioritize enhancing the training of base-class prototypes while neglecting the rich semantic information within background regions for future novel classes. We introduce a novel GFS3DS learner that strategically leverages background context to improve both base prototype training and few-shot adaptability. Our method employs foundation models to extract semantic features from background points and grounds on text embeddings to cluster background points into pseudo-classes. This approach facilitates clearer base/novel class differentiation and generates pseudo prototypes that effectively mimic novel support samples. Comprehensive experiments on S3DIS and ScanNet datasets demonstrate the state-of-the-art performance of our method in both 1-shot and 5-shot tasks. Our approach significantly advances GFS3DS by unlocking the potential of background context, offering a promising avenue for broader applications.

SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models

Interspeech 2024, September 2024

Chun Yin, Tai-Shih Chi, Yu Tsao, and Hsin-Min Wang

Abstract

Representations from pre-trained speech foundation models (SFMs) have shown impressive performance in many downstream tasks. However, the potential benefits of incorporating pre-trained SFM representations into speaker voice similarity assessment have not been thoroughly investigated. In this paper, we propose SVSNet+, a model that integrates pre-trained SFM representations to improve performance in assessing speaker voice similarity. Experimental results on the Voice Conversion Challenge 2018 and 2020 datasets show that SVSNet+ incorporating WavLM representations shows significant improvements compared to baseline models. In addition, while fine-tuning WavLM with a small dataset of the downstream task does not improve performance, using the same dataset to learn a weighted-sum representation of WavLM can substantially improve performance. Furthermore, when WavLM is replaced by other SFMs, SVSNet+ still outperforms the baseline models and exhibits strong generalization ability.

One-Class Face Anti-spoofing via Spoof Cue Map-Guided Feature Learning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024

Pei-Kai Huang, Cheng-Hsuan Chiang, Tzu-Hsien Chen, Jun-Xiong Chong, Tyng-Luh Liu and Chiou-ting Hsu

Abstract

Many face anti-spoofing (FAS) methods have focused on learning discriminative features from both live and spoof training data to strengthen the security of face recognition systems. However, since not every possible attack type is available in the training stage, these FAS methods usually fail to detect unseen attacks in the inference stage. In comparison, one-class FAS, where training data comprise only live faces, aims to detect whether a test face image belongs to the live class or not. In this paper, we propose a novel One-Class Spoof Cue Map estimation Network (OC-SCMNet) to address the one-class FAS detection problem. Our first goal is to learn to extract latent spoof features from live images so that their estimated Spoof Cue Maps (SCMs) should have zero responses. To avoid collapsing to a trivial solution, we devise a novel SCM-guided feature learning that combines many SCMs as pseudo ground-truths to guide a conditional generator to create latent spoof features for spoof data. Our second goal is to approximately simulate potential out-of-distribution spoof attacks. To this end, we propose using a memory bank to dynamically preserve a set of sufficiently "independent" latent spoof features to encourage the generator to probe the latent spoof feature space. Extensive experiments conducted on eight FAS benchmark datasets demonstrate that the proposed OC-SCMNet not only outperforms previous one-class FAS approaches but also achieves performance comparable to the state-of-the-art two-class FAS methods.

Bridging Actions: Generate 3D Poses and Shapes In-between Photos

IEEE Transactions on Pattern Analysis and Machine Intelligence, April 2024

Wen-Li Wei and Jen-Chun Lin

Abstract

Generating realistic 3D human motion has been a fundamental goal of the game/animation industry. This work presents a novel transition generation technique that can bridge the actions of people in the foreground by generating 3D poses and shapes in-between photos, allowing 3D animators/novice users to easily create/edit 3D motions. To achieve this, we propose an adaptive motion network (ADAM-Net) that effectively learns human motion from masked action sequences to generate kinematically compliant 3D poses and shapes in-between given temporally-sparse photos. Three core learning designs underpin ADAM-Net. First, we introduce a random masking process that randomly masks images from an action sequence and fills masked regions in latent space by interpolation of unmasked images to simulate various transitions under given temporally-sparse photos. Second, we propose a long-range adaptive motion (L-ADAM) attention module that leverages visual cues observed from human motion to adaptively recalibrate the range that needs attention in a sequence, along with a multi-head cross-attention. Third, we develop a short-range adaptive motion (S-ADAM) attention module that selects and integrates, with learned weights, adjacent feature representations at different levels to strengthen temporal correlation. By coupling these designs, the results demonstrate that ADAM-Net excels not only in generating 3D poses and shapes in-between photos, but also in classic 3D human pose and shape estimation.

Contrastive Learning for DeepFake Classification and Localization via Multi-Label Ranking

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024

Cheng-Yao Hong, Yen-Chi Hsu and Tyng-Luh Liu

Abstract

We propose a unified approach to simultaneously addressing the conventional setting of binary deepfake classification and a more challenging scenario of uncovering what facial components have been forged as well as the exact order of the manipulations. To solve the former task, we consider multiple instance learning (MIL) that takes each image as a bag and its patches as instances. A positive bag corresponds to a forged image that includes at least one manipulated patch (i.e., a pixel in the feature map). The formulation allows us to estimate the probability of an input image being a fake one and establish the corresponding contrastive MIL loss. On the other hand, tackling the component-wise deepfake problem can be reduced to solving multi-label prediction, but the requirement to recover the manipulation order further complicates the learning task into a multi-label ranking problem. We resolve this difficulty by designing a tailor-made loss term to enforce that the rank order of the predicted multi-label probabilities respects the ground-truth order of the sequential modifications of a deepfake image. Through extensive experiments, comparisons with other relevant techniques, and ablation studies, we demonstrate that the proposed method is an overall more comprehensive solution to deepfake detection.
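
The bag/instance formulation described above can be made concrete with a minimal sketch. The noisy-OR aggregation below is one standard MIL rule consistent with "a positive bag contains at least one manipulated patch"; the paper's actual aggregation and its contrastive MIL loss may differ.

```python
import numpy as np

def bag_fake_probability(patch_probs):
    """MIL aggregation sketch: an image is a bag of patch-level fake
    probabilities, and the bag is fake if at least one patch is.
    Noisy-OR rule: P(fake) = 1 - prod_i (1 - p_i)."""
    p = np.asarray(patch_probs, dtype=float)
    return 1.0 - np.prod(1.0 - p)
```

A single confidently manipulated patch drives the bag probability to 1, while an all-clean bag stays at 0, which matches the "at least one manipulated patch" semantics of a positive bag.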

Effective Compression of Language Models by Combining Pruning and Knowledge Distillation

IEEE International Conference on Computers, Software, and Applications (COMPSAC), July 2024

Chi-Yu Chiu, Ding-Yong Hong, Pangfeng Liu and Jan-Jan Wu

Abstract

Weight pruning is a prominent model compression technique that removes some weights from a model. However, pruning Transformer models faces a challenge: after pruning, they require repeating the whole training process, including pre-training on a large general-purpose dataset and fine-tuning on a small downstream dataset, to recover accuracy. This full training process takes a long time and substantial computational resources. To address this challenge, we propose a pruning method combined with knowledge distillation that recovers accuracy without a long re-training time. We use 2:4 pruning, proposed by NVIDIA, as our basic pruning method; it keeps the two entries with the largest absolute values in every four consecutive elements of each row of a weight matrix. We generalize 2:4 pruning to N:M pruning, which keeps the N largest-magnitude values in every M consecutive elements of each row. Knowledge distillation is another model compression method in which a small model (the student) learns from a large model (the teacher). With our method, we first use N:M pruning to uniformly prune the model into an N:M structure, and then apply two-stage fine-tuning on the downstream dataset with knowledge distillation. The pruned models achieve comparable accuracy using only downstream datasets and take much less time than traditional retraining. We run our experiments on the SQuAD and GLUE datasets using DistilBERT. The experimental results show that DistilBERT in a 1:4 structure achieves comparable accuracy on the SQuAD v1.1 and SQuAD v2.0 datasets and a 1.7x inference speedup compared to the original dense model.
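
The N:M rule described above is simple enough to sketch directly. The helper below is an illustrative NumPy implementation of the magnitude-based rule, not the authors' code; a production implementation would operate on framework tensors and use a mask rather than a loop.

```python
import numpy as np

def nm_prune(weights, n=2, m=4):
    """N:M structured pruning: in every group of m consecutive elements
    of each row, keep the n entries with the largest absolute value and
    zero the rest (2:4 is the sparse-tensor-core-friendly special case)."""
    w = np.asarray(weights, dtype=float).copy()
    rows, cols = w.shape
    assert cols % m == 0, "row length must be a multiple of m"
    for r in range(rows):
        for c in range(0, cols, m):
            group = w[r, c:c + m]
            # indices of the (m - n) smallest-magnitude entries in this group
            drop = np.argsort(np.abs(group))[: m - n]
            group[drop] = 0.0
    return w
```

For example, pruning the row `[1, -3, 2, 0.5, 4, -1, 0.1, 5]` with n=2, m=4 keeps `-3, 2` in the first group and `4, 5` in the second, zeroing the rest.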

A Study on Incorporating Whisper for Robust Speech Assessment

IEEE International Conference on Multimedia and Expo (ICME), July 2024

Ryandhimas E. Zezario, Yu-Wen Chen, Szu-Wei Fu, Yu Tsao, Hsin-Min Wang, and Chiou-Shann Fuh

Abstract

This research introduces an enhanced version of the multi-objective speech assessment model, MOSA-Net+, by leveraging the acoustic features from Whisper, a large-scale weakly supervised model. We first investigate the effectiveness of Whisper in deploying a more robust speech assessment model. After that, we explore combining representations from Whisper and SSL models. The experimental results reveal that Whisper's embedding features can contribute to more accurate prediction performance achieved by MOSA-Net+. Moreover, combining the embedding features from Whisper and SSL models leads to only marginal improvement. Compared to intrusive methods, MOSA-Net, and other SSL-based speech assessment models, MOSA-Net+ yields notable improvements in estimating subjective quality and intelligibility scores across all evaluation metrics on the Taiwan Mandarin Hearing in Noise Test - Quality & Intelligibility (TMHINT-QI) dataset. To further validate its robustness, MOSA-Net+ was tested in the noisy-and-enhanced track of the VoiceMOS Challenge 2023, where it obtained the top-ranked performance among nine systems.

Generating Attractive and Authentic Copywriting from Customer Reviews

NAACL 2024, June 2024

Yu-Xiang Lin, Wei-Yun Ma

Abstract

The goal of product copywriting is to capture the interest of users and enhance their experience by emphasizing the features of products through text descriptions. As e-commerce platforms offer a wide range of services, it's becoming essential to dynamically adjust the styles of these auto-generated descriptions. Traditional approaches to copywriting generation often rely solely on specified product attributes, which may result in dull and repetitive content. To tackle this issue, we propose to generate copywriting based on customer reviews, as they provide firsthand practical experiences with the product, offering a richer source of information than just product attributes. We have developed a sequence-to-sequence framework, enhanced with reinforcement learning, to produce copywriting that is attractive, authentic, and rich in information. Our framework outperforms all existing baseline and zero-shot large language models, including LLaMA-2-chat-7B and GPT-3.5 (gpt-3.5-turbo-0613), in terms of both attractiveness and faithfulness. Furthermore, this work features the use of LLMs for aspect-based summaries collection and argument allure assessment. Experiments demonstrate the effectiveness of using LLMs for marketing domain corpus construction. The dataset will be made available in the future.

Plug-in Language Model: Controlling Text Generation with a Simple Regression Model

NAACL 2024, June 2024

Nai-Chi Yang, Wei-Yun Ma, Pu-Jen Cheng

Abstract

Large-scale pre-trained language models have displayed unrivaled capacity in generating text that closely resembles human-written text. Nevertheless, generating texts adhering to specific conditions without fine-tuning or adding new parameters can be challenging.
Contemporary approaches commonly rely on either prompts or auxiliary models to avoid modifying the language models. These auxiliary models are designed to assess whether a generated token contributes to meeting the desired requirements. These approaches adjust the distribution of the next token during the inference phase by leveraging the prediction score of the desired attribute to calculate gradients. However, these auxiliary models typically require the language model's latent states. This prerequisite challenges integrating various existing black box attribute models or tools.
We present the Plug-in Language Model (PiLM) as a solution to address the limitations. PiLM leverages reinforcement learning to utilize black box tools directly, adjusting the latent state to control text generation. However, performing backpropagation during the inference phase is time-consuming for PiLM. By replacing backpropagation with a simple regression model, PiLM can achieve an inference time comparable to that of the original LLM.
Experimental results show that our approach outperforms existing state-of-the-art methods based on gradients, weighted decoding, or prompting.

Automatic Construction of a Chinese Review Dataset for Aspect Sentiment Triplet Extraction via Iterative Weak Supervision

LREC-COLING 2024, May 2024

Chia-Wen Lu, Ching-Wen Yang, Wei-Yun Ma

Abstract

Aspect Sentiment Triplet Extraction (ASTE), introduced in 2020, involves extracting three key elements: target aspects, descriptive opinion spans, and their corresponding sentiment polarity. The task faces a significant hurdle when applied to Chinese, due to the lack of sufficient datasets for model training, largely attributable to the arduous manual labeling process. To address this issue, we present an innovative framework that automatically constructs an ASTE dataset via Iterative Weak Supervision, negating the need for manual labeling, aided by a discriminator that weeds out subpar samples. The objective is to successively improve the quality of the raw data and generate supplementary data. The effectiveness of our approach is underscored by our results, which include the creation of a substantial Chinese review dataset encompassing over 60,000 Google restaurant reviews in Chinese and more than 200,000 extracted triplets. Moreover, we have established a robust baseline model by leveraging a novel method of weak supervision. Both our dataset and model are openly accessible to the public.

Decoding the genome of bloodsucking midge Forcipomyia taiwana (diptera: Ceratopogonidae): Insights into odorant receptor expansion

Insect Biochemistry and Molecular Biology, May 2024

Ming-Der Lin, Chia-Hsien Chuang, Chih-Hsin Kao, Shu-Hwa Chen, Szu-Chieh Wang, Ping-Heng Hsieh, Guan-Yu Chen, Chun-Chia Mao, Jeng-Yi Li, Mei-Yeh Jade Lu, Chung-Yen Lin

Abstract

Biting midges, notably those within the Ceratopogonidae family, have long been recognized for their epidemiological significance, both as nuisances and vectors for disease transmission in vertebrates. Despite their impact, genomic insights into these insects, particularly beyond the Culicoides genus, remain limited. In this study, we assembled the Forcipomyia taiwana (Shiraki) genome, comprising 113 scaffolds covering 130.4 Mbp, with the longest scaffold reaching 7.6 Mbp and an N50 value of 2.6 Mbp, marking a pivotal advancement in understanding the genetic architecture of ceratopogonid biting midges. Phylogenomic analyses reveal a shared ancestry between F. taiwana and Culicoides sonorensis Wirth & Jones, dating back approximately 124 million years, and highlight a dynamic history of gene family expansions and contractions within the Ceratopogonidae family. Notably, a substantial expansion of the odorant receptor (OR) gene family was observed, which is crucial for the chemosensory capabilities that govern biting midges' interactions with their environment, including host seeking and oviposition behaviors. The distribution of OR genes across the F. taiwana genome displays notable clusters on scaffolds, indicating localized tandem gene duplication events. Additionally, several collinear regions were identified, hinting at segmental duplications, inversions, and translocations, contributing to the olfactory system's evolutionary complexity. Among the 156 ORs identified in F. taiwana, 134 are biting midge-specific ORs, distributed across three distinct clades, each exhibiting unique motif features that distinguish them from the others. Through weighted gene co-expression network analysis, we correlated distinct gene modules with sex and reproductive status, laying the groundwork for future investigations into the interplay between gene expression and adaptive behaviors in F. taiwana.
In conclusion, our study not only highlights the unique olfactory repertoire of ceratopogonid biting midges but also sets the stage for future studies into the genetic underpinnings of their unique biological traits and ecological strategies.

Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model

IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP2024), April 2024

Ryandhimas Zezario, Bo-Ren Brian Bai, Chiou-Shann Fuh, Hsin-Min Wang, and Yu Tsao

Abstract

This study proposes a multi-task pseudo-label learning (MPL)-based non-intrusive speech quality assessment model called MTQ-Net. MPL consists of two stages: obtaining pseudo-label scores from a pretrained model and performing multi-task learning. The 3QUEST metrics, namely Speech-MOS (S-MOS), Noise-MOS (N-MOS), and General-MOS (G-MOS), are the assessment targets. The pretrained MOSA-Net model is utilized to estimate three pseudo labels: perceptual evaluation of speech quality (PESQ), short-time objective intelligibility (STOI), and speech distortion index (SDI). Multi-task learning is then employed to train MTQ-Net by combining a supervised loss (derived from the difference between the estimated score and the ground-truth label) and a semi-supervised loss (derived from the difference between the estimated score and the pseudo label), where the Huber loss is employed as the loss function. Experimental results first demonstrate the advantages of MPL compared to training a model from scratch and using a direct knowledge transfer mechanism. Second, the benefit of the Huber loss for improving the predictive ability of MTQ-Net is verified. Finally, MTQ-Net with the MPL approach exhibits higher overall predictive power than other SSL-based speech assessment models.
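
The combined objective described above (a supervised Huber term against the ground-truth label plus a semi-supervised Huber term against the pseudo label) can be sketched as follows. The function names and the equal weighting `alpha=1.0` are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def huber(err, delta=1.0):
    """Huber loss: quadratic near zero, linear in the tails."""
    a = np.abs(err)
    return np.where(a <= delta, 0.5 * a ** 2, delta * (a - 0.5 * delta))

def mpl_loss(pred, ground_truth, pseudo, alpha=1.0):
    """Sketch of an MPL-style objective: supervised Huber term on
    ground-truth labels plus a semi-supervised Huber term on pseudo
    labels produced by a pretrained model. alpha balances the terms."""
    sup = huber(pred - ground_truth).mean()
    semi = huber(pred - pseudo).mean()
    return sup + alpha * semi
```

The Huber loss is the reason this objective tolerates noisy pseudo labels: large pseudo-label errors contribute only linearly rather than quadratically, so a few bad pseudo scores do not dominate training.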

A formal treatment of bidirectional typing

33rd European Symposium on Programming (ESOP 2024), April 2024

Liang-Ting Chen and Hsiang-Shang Ko

Abstract

There has been much progress in designing bidirectional type systems and associated type synthesis algorithms, but mainly on a case-by-case basis. To remedy the situation, this paper develops a general and formal theory of bidirectional typing and, as a by-product of our formalism, provides a verified generator of proof-relevant type synthesisers for simply typed languages: for every signature that specifies a mode-correct bidirectionally typed language, there exists a proof-relevant type synthesiser that, for an input abstract syntax tree, constructs a typing derivation if one exists, gives its refutation if not, or reports that the input does not have enough type annotations. Soundness, completeness, and mode-correctness are studied universally for all signatures; these are sufficient conditions for deriving a type synthesiser. We propose a preprocessing step called mode decoration, which helps the user deal with missing type annotations in a given abstract syntax tree. The entire development is formalised in Agda and can be further integrated with other language-formalisation frameworks.
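
The three outcomes the synthesiser distinguishes (a derivation, a refutation, or "not enough annotations") can be illustrated with a toy bidirectional checker for the simply typed lambda calculus. This Python sketch only illustrates the synthesis/checking modes that bidirectional typing alternates between; it is in no way the paper's Agda development, and the term/type encodings are ad hoc.

```python
# Terms: ('var', name) | ('lam', name, body) | ('app', f, a) | ('ann', term, type)
# Types: 'base' | ('arrow', dom, cod)

def synth(ctx, term):
    """Synthesis mode: compute a type for the term, or raise TypeError."""
    tag = term[0]
    if tag == 'var':
        if term[1] in ctx:
            return ctx[term[1]]
        raise TypeError(f"unbound variable {term[1]}")
    if tag == 'ann':
        check(ctx, term[1], term[2])   # mode switch: check against the annotation
        return term[2]
    if tag == 'app':
        f_ty = synth(ctx, term[1])
        if not (isinstance(f_ty, tuple) and f_ty[0] == 'arrow'):
            raise TypeError("applying a non-function")
        check(ctx, term[2], f_ty[1])
        return f_ty[2]
    # A bare lambda cannot synthesise -- the third outcome.
    raise TypeError("not enough type annotations")

def check(ctx, term, ty):
    """Checking mode: verify the term against an expected type."""
    if term[0] == 'lam':
        if not (isinstance(ty, tuple) and ty[0] == 'arrow'):
            raise TypeError("lambda must have an arrow type")
        return check({**ctx, term[1]: ty[1]}, term[2], ty[2])
    # mode switch: synthesise and compare
    if synth(ctx, term) != ty:
        raise TypeError("type mismatch")
```

An annotated identity function synthesises its arrow type, while the same lambda without an annotation is rejected for lacking annotations rather than for being ill-typed, mirroring the paper's three-way verdict.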