Learning from periodic data (signals that repeat, such as a heartbeat or the daily temperature changes on Earth’s surface) is crucial for many real-world applications, from monitoring weather systems to detecting vital signs. For example, in environmental remote sensing, periodic learning is often needed to enable nowcasting of environmental changes, such as precipitation patterns or land surface temperature. In the health domain, learning from video measurements has been shown to extract (quasi-)periodic vital signs such as atrial fibrillation and sleep apnea episodes.
Approaches like RepNet highlight the importance of these types of tasks and present a solution that recognizes repetitive activities within a single video. However, these are supervised approaches that require a significant amount of data to capture repetitive activities, all labeled with the number of times the action is repeated. Labeling such data is often challenging and resource-intensive, because it requires researchers to manually capture gold-standard temporal measurements that are synchronized with the modality of interest (e.g., video or satellite imagery).
Alternatively, self-supervised learning (SSL) methods (e.g., SimCLR and MoCo v2), which use large amounts of unlabeled data to learn representations that capture periodic or quasi-periodic temporal dynamics, have shown success in solving classification tasks. However, they overlook the intrinsic periodicity in the data (i.e., the ability to identify whether a frame is part of a periodic process) and fail to learn robust representations that capture periodic or frequency attributes. This is because periodic learning exhibits characteristics that are distinct from prevailing learning tasks.
Feature similarity is different in the context of periodic representations than for static features (e.g., images). For example, videos that are offset by a short time delay or reversed should be similar to the original sample, whereas videos that have been upsampled or downsampled by a factor x should differ from the original sample by a factor of x.
To address these challenges, in “SimPer: Simple Self-Supervised Learning of Periodic Targets”, published at the Eleventh International Conference on Learning Representations (ICLR 2023), we introduce a self-supervised contrastive framework for learning periodic information in data. Specifically, SimPer exploits the temporal properties of periodic targets using temporal self-contrastive learning, where positive and negative samples are obtained through periodicity-invariant and periodicity-variant augmentations of the same input instance. We propose periodic feature similarity, which explicitly defines how to measure similarity in the context of periodic learning. Moreover, we design a generalized contrastive loss that extends the classic InfoNCE loss to a soft regression variant that enables contrasting over continuous labels (frequency). Next, we show that SimPer effectively learns periodic feature representations compared to state-of-the-art SSL methods, highlighting its intriguing properties, including better data efficiency, robustness to spurious correlations, and generalization to distribution shifts. Finally, we are excited to release the SimPer code repo to the research community.
SimPer framework
SimPer introduces a temporal self-contrastive learning framework. Positive and negative samples are obtained through periodicity-invariant and periodicity-variant augmentations of the same input instance. For temporal video examples, periodicity-invariant changes include cropping, rotation, or flipping, whereas periodicity-variant changes involve increasing or decreasing the speed of the video.
To explicitly define how to measure similarity in the context of periodic learning, SimPer proposes periodic feature similarity. This construction allows us to formulate training as a contrastive learning task. A model can be trained on data without any labels and then fine-tuned, if necessary, to map the learned features to specific frequency values.
Given an input sequence x, we know that there is an underlying associated periodic signal. We then transform x to create a series of speed- or frequency-altered samples, which changes the underlying periodic target and thus creates different negative views. Although the original frequency is unknown, we effectively devise pseudo speed or frequency labels for the unlabeled input x.
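As a rough illustration of this step, the sketch below builds speed-altered views of a 1-D sequence and pairs each with its relative speed as a pseudo-label. It is a minimal sketch under stated assumptions, not the SimPer implementation: the function names (`change_speed`, `periodicity_variant_views`, `periodicity_invariant_view`), the use of linear interpolation, and the toy sine input are all hypothetical choices made for this example, and only the Python standard library is used.

```python
import math


def change_speed(signal, speed):
    """Resample a 1-D sequence so it plays `speed` times faster.

    Uses linear interpolation between neighbouring samples (an assumption of
    this sketch). For speed > 1 the source runs out early, so the output is
    shorter than the input.
    """
    n = len(signal)
    out = []
    for i in range(n):
        pos = i * speed          # fractional index into the original sequence
        lo = int(pos)            # pos >= 0, so truncation equals floor
        if lo + 1 >= n:          # no right-hand neighbour left to interpolate with
            break
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[lo + 1])
    return out


def periodicity_variant_views(signal, speeds):
    """Return (view, pseudo_label) pairs.

    The pseudo-label is the relative speed, not the true (unknown) frequency.
    """
    return [(change_speed(signal, s), s) for s in speeds]


def periodicity_invariant_view(signal, delay=0, reverse=False):
    """Apply a short time delay and/or time reversal, leaving the frequency unchanged."""
    delayed = signal[delay:]
    return delayed[::-1] if reverse else delayed


# Toy input: 4 cycles over 64 samples (hypothetical values, not data from the paper).
signal = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
views = periodicity_variant_views(signal, speeds=[0.5, 2.0])   # negative views
positive = periodicity_invariant_view(signal, delay=3, reverse=True)
```

Each negative view carries a label that is only meaningful relative to the other views of the same input, which is what makes the labels "pseudo" labels.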
Conventional similarity measures, such as cosine similarity, emphasize strict proximity between two feature vectors and are sensitive to index-shifted features (which represent different timestamps), reversed features, and features with changed frequencies. In contrast, periodic feature similarity should be high for samples with small temporal shifts or reversed indices, while capturing a continuous change in similarity as the feature frequency varies. This can be achieved with a similarity metric in the frequency domain, such as the distance between two Fourier transforms.
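The contrast between the two kinds of similarity can be seen on a toy example. The sketch below compares cosine similarity with one possible frequency-domain measure, the negative Euclidean distance between normalized power spectra. This particular measure, the helper names, and the sine-wave inputs are assumptions made for illustration; the text only says that a frequency-domain metric such as a distance between Fourier transforms can be used. A naive DFT from the standard library keeps the example self-contained.

```python
import cmath
import math


def power_spectrum(x):
    """Squared DFT magnitudes of a real sequence (naive O(n^2) transform)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]


def cosine_similarity(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


def spectral_similarity(a, b):
    """Negative distance between normalized power spectra; 0 means identical spectra."""
    pa, pb = power_spectrum(a), power_spectrum(b)
    sa, sb = sum(pa), sum(pb)
    return -math.sqrt(sum((p / sa - q / sb) ** 2 for p, q in zip(pa, pb)))


n = 64
base = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]           # 4 cycles
shifted = [math.sin(2 * math.pi * 4 * (t + 8) / n) for t in range(n)]  # half-period delay
flipped = base[::-1]                                                   # time reversal
faster = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]         # 5 cycles

print("cosine   base vs shifted:", cosine_similarity(base, shifted))
print("spectral base vs shifted:", spectral_similarity(base, shifted))
print("spectral base vs flipped:", spectral_similarity(base, flipped))
print("spectral base vs faster: ", spectral_similarity(base, faster))
```

Cosine similarity treats the half-period shift as a near-opposite vector, while the spectral measure regards the shifted and reversed copies as essentially identical to the original and penalizes only the frequency change. For pure tones that fall on distinct DFT bins this simple distance saturates, so the example illustrates the shift and reversal invariance rather than the continuous change with frequency.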
To harness the intrinsic continuity of augmented samples in the frequency domain, SimPer designs a generalized contrastive loss that extends the classic InfoNCE loss to a soft regression variant that enables contrasting over continuous labels (frequency). This makes it suitable for regression tasks, where the goal is to recover a continuous signal such as a heart rate.
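One way to picture a soft variant of InfoNCE is as a cross-entropy in which the one-hot "positive" target is replaced by a soft target distribution derived from how close the continuous labels are. The sketch below follows that reading. It is an assumption-laden illustration rather than the loss as defined in the paper: the softmax weighting of label similarities, the negative absolute difference used as label similarity, the temperature of 0.1, and all numbers are hypothetical.

```python
import math


def log_softmax(values):
    """Numerically stable log-softmax over a list of floats."""
    m = max(values)
    log_total = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_total for v in values]


def soft_contrastive_loss(feature_sims, label_sims, temperature=0.1):
    """Cross-entropy between label-derived soft targets and feature-similarity logits.

    feature_sims[j]: similarity between the anchor's features and view j's features.
    label_sims[j]:   similarity between the anchor's pseudo-label and view j's pseudo-label.
    With a one-hot target this reduces to the usual InfoNCE term for that positive.
    """
    targets = [math.exp(v) for v in log_softmax(label_sims)]
    log_probs = log_softmax([s / temperature for s in feature_sims])
    return -sum(t * lp for t, lp in zip(targets, log_probs))


# Anchor at relative speed 1.0 compared against views at three speeds (toy values).
speeds = [1.0, 1.5, 2.0]
label_sims = [-abs(1.0 - s) for s in speeds]

aligned = soft_contrastive_loss([0.9, 0.5, 0.1], label_sims)     # features track the labels
misaligned = soft_contrastive_loss([0.1, 0.5, 0.9], label_sims)  # features contradict them
print("aligned loss:", aligned, "misaligned loss:", misaligned)
```

Because the targets are graded rather than binary, a view whose speed is close to the anchor's is pulled closer in feature space than a view whose speed is far away, which is the behaviour a regression target such as frequency calls for.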
SimPer constructs negative views of the data through transformations in the frequency domain. The input sequence x has an underlying associated periodic signal. SimPer transforms x to create a series of speed- or frequency-altered samples, which changes the underlying periodic target and thus creates different negative views. Although the original frequency is unknown, we effectively devise pseudo speed or frequency labels for the unlabeled input x (periodicity-variant augmentations T). SimPer takes transformations that do not change the identity of the input and defines them as periodicity-invariant augmentations S, thus creating different positive views of the sample. It then sends these augmented views to the encoder V, which extracts the corresponding features.
Results
To evaluate the performance of SimPer, we benchmark it against state-of-the-art SSL schemes (e.g., SimCLR, MoCo v2, BYOL, CVRL) on six diverse periodic learning datasets for common real-world tasks in human behavior analysis, environmental remote sensing, and healthcare. Specifically, below we present results on heart rate measurement and exercise repetition counting from video. The results show that SimPer outperforms the state-of-the-art SSL schemes on all six datasets, highlighting its superior performance in terms of data efficiency, robustness to spurious correlations, and generalization to unseen targets.
Here we show quantitative results on two representative datasets, with models pre-trained using different SSL methods and then fine-tuned on labeled data. First, we pre-train SimPer on the Univ. Bourgogne Franche-Comté Remote PhotoPlethysmoGraphy (UBFC) dataset, a human photoplethysmography and heart rate prediction dataset, and compare its performance with state-of-the-art SSL methods. We observe that SimPer outperforms the SimCLR, MoCo v2, BYOL, and CVRL methods. Results on the human action counting dataset, Countix, further confirm the benefits of SimPer over other methods, as it notably outperforms the supervised baseline. For feature evaluation results and performance on other datasets, please see the paper.
Results of SimCLR, MoCo v2, BYOL, CVRL, and SimPer on the Univ. Bourgogne Franche-Comté Remote PhotoPlethysmoGraphy (UBFC) and Countix datasets. Heart rate and repetition count performance is reported as mean absolute error (MAE).
Conclusion and applications
We present SimPer, a self-supervised contrastive framework for learning periodic information in data. We show that by combining a temporal self-contrastive learning framework, periodicity-invariant and periodicity-variant augmentations, and continuous periodic feature similarity, SimPer provides an intuitive and flexible approach to learning robust feature representations of periodic signals. In addition, SimPer can be applied in a variety of fields, from environmental remote sensing to healthcare.
Acknowledgments
We would like to thank Yuzhe Yang, Xin Liu, Ming-Zher Poh, Jiang Wu, Silviu Borac, and Dina Katabi for their contributions to this work.