Figures and Tables from this paper
- Figure 1; Tables 1–4, 6, 7, 9–12, and 15
4 Citations
- R. Peeters, Christian Bizer
- 2023
Computer Science
ArXiv
It is shown that for use cases that do not allow data to be shared with third parties, open-source LLMs can be a viable alternative to hosted LLMs given that a small amount of training data or matching knowledge is required.
- Xi Fang, Weijie Xu, Christos Faloutsos
- 2024
Computer Science
ArXiv
This survey aims to address the gap by consolidating recent progress in these areas, offering a thorough survey and taxonomy of the datasets, metrics, and methodologies utilized, while providing some insights for future research directions in this vital and rapidly evolving field.
- David Selby, Kai Spriestersbach, Sebastian Vollmer
- 2024
Computer Science, Linguistics
ArXiv
The feasibility of LLMs as a mechanism for quantitative knowledge retrieval to aid data analysis tasks such as elicitation of prior distributions for Bayesian models and imputation of missing data is explored.
- Ju Fan, Jianhong Tu, Nan Tang
Computer Science
The proposed Unicorn, a unified model for generally supporting common data matching tasks, adopts a mixture-of-experts layer that refines the learned representation, and achieves better performance on most tasks and on average than state-of-the-art models trained separately for ad-hoc tasks and datasets.
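The mixture-of-experts idea in the Unicorn entry above can be illustrated with a small sketch. This is not Unicorn's implementation; it is a generic soft-gated mixture of expert MLPs applied to an already-encoded representation, with the hidden size and number of experts chosen purely for illustration.

```python
# Minimal sketch of a mixture-of-experts layer over a shared representation.
# Dimensions and the number of experts are illustrative assumptions.
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, dim: int = 256, num_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_experts)])
        self.gate = nn.Linear(dim, num_experts)  # soft gating network

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(x), dim=-1)            # (B, E)
        outputs = torch.stack([e(x) for e in self.experts], 1)   # (B, E, D)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)      # (B, D)

moe = MoELayer()
refined = moe(torch.randn(8, 256))  # refined representation fed to a matcher head
```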
115 References
- Haochen Zhang, Yuyang Dong, Chuan Xiao, M. Oyamada
- 2023
Computer Science, Linguistics
ArXiv
An LLM-based framework for data preprocessing is proposed, which integrates cutting-edge prompt engineering techniques, coupled with traditional methods like contextualization and feature selection, to improve the performance and efficiency of these models.
- Ariel N. Lee, Cole J. Hunter, Nataniel Ruiz
- 2023
Computer Science
ArXiv
The Platypus family achieves strong performance in quantitative LLM metrics across model sizes, topping the global Open LLM leaderboard while using just a fraction of the fine-tuning data and overall compute required for other state-of-the-art fine-tuned LLMs.
- Xiang Lisa Li, Percy Liang
- 2021
Computer Science
ACL
Prefix-tuning is proposed, a lightweight alternative to fine-tuning for natural language generation tasks, which keeps language model parameters frozen and instead optimizes a sequence of continuous task-specific vectors, called the prefix.
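A minimal sketch of the prefix-tuning idea summarized above: the backbone and embedding weights stay frozen and only a short sequence of continuous prefix vectors, prepended to the input embeddings, is trained. The tiny TransformerEncoder backbone and all dimensions here are stand-ins, not the paper's setup.

```python
# Sketch: frozen language model + trainable continuous prefix vectors.
import torch
import torch.nn as nn

class PrefixTunedLM(nn.Module):
    def __init__(self, backbone: nn.Module, embed: nn.Embedding,
                 prefix_len: int = 10, hidden: int = 128):
        super().__init__()
        self.backbone, self.embed = backbone, embed
        for p in self.backbone.parameters():   # freeze all LM weights
            p.requires_grad = False
        for p in self.embed.parameters():
            p.requires_grad = False
        # the only trainable parameters: prefix_len continuous vectors
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden) * 0.02)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tok = self.embed(input_ids)                          # (B, T, H)
        pre = self.prefix.unsqueeze(0).expand(tok.size(0), -1, -1)
        return self.backbone(torch.cat([pre, tok], dim=1))   # (B, P+T, H)

vocab, hidden = 1000, 128
embed = nn.Embedding(vocab, hidden)
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True),
    num_layers=2)
model = PrefixTunedLM(backbone, embed)
# only the prefix parameters are trainable
print(sum(p.numel() for p in model.parameters() if p.requires_grad))
```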
- Zhoujun Cheng, Jungo Kasai, Tao Yu
- 2023
Computer Science
EMNLP
Batch prompting, a simple yet effective prompting approach that enables the LLM to run inference in batches, instead of one sample at a time, is proposed, which reduces both token and time costs while retaining downstream performance.
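A minimal sketch of batch prompting as summarized above: several samples are packed into one prompt and the per-sample answers are parsed from a single response. The `call_llm` callable, the task string, and the prompt/answer format are hypothetical placeholders for whatever completion API is actually used.

```python
# Sketch: pack k samples per LLM call instead of one call per sample.
from typing import Callable, List

def batch_prompt(samples: List[str], call_llm: Callable[[str], str],
                 k: int = 4, task: str = "Classify the sentiment") -> List[str]:
    answers: List[str] = []
    for start in range(0, len(samples), k):
        batch = samples[start:start + k]
        numbered = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(batch))
        prompt = (f"{task} of each input. Answer with one line per input, "
                  f"formatted as '[index] label'.\n{numbered}")
        response = call_llm(prompt)           # one call covers the whole batch
        lines = [l.strip() for l in response.splitlines() if l.strip()]
        answers.extend(lines[:len(batch)])    # one parsed answer per input
    return answers

# usage with a dummy model standing in for a real API
dummy = lambda prompt: "\n".join(f"[{i}] positive" for i in range(1, 5))
print(batch_prompt(["great!", "awful.", "fine.", "meh."], dummy))
```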
- Patrick Lewis, Ethan Perez, Douwe Kiela
- 2020
Computer Science
NeurIPS
A general-purpose fine-tuning recipe for retrieval-augmented generation (RAG) -- models which combine pre-trained parametric and non-parametric memory for language generation, and finds that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
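A minimal sketch of the retrieve-then-generate pattern behind RAG: score passages against the query, keep the top-k, and condition the generator on them. The `embed` and `generate` callables are hypothetical stand-ins for a dense encoder and a seq2seq generator; the paper itself fine-tunes retriever and generator jointly rather than composing fixed components.

```python
# Sketch: cosine-similarity retrieval followed by conditioned generation.
from typing import Callable, List
import numpy as np

def rag_answer(query: str, passages: List[str],
               embed: Callable[[str], np.ndarray],
               generate: Callable[[str], str], top_k: int = 3) -> str:
    q = embed(query)
    scores = []
    for p in passages:
        v = embed(p)
        scores.append(float(np.dot(q, v) /
                            (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9)))
    best = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)[:top_k]
    context = "\n".join(passages[i] for i in best)   # non-parametric memory
    return generate(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
```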
- Simran Arora, Brandon Yang, Christopher Ré
- 2023
Computer Science
Proc. VLDB Endow.
This work proposes and evaluates Evaporate, a prototype system powered by large language models, along with an extended implementation, Evaporate-Code+, which achieves better quality than direct extraction.
- Xiaoqi Jiao, Yichun Yin, Qun Liu
- 2020
Computer Science
FINDINGS
A novel Transformer distillation method specially designed for knowledge distillation (KD) of Transformer-based models is proposed; by leveraging this new KD method, the rich knowledge encoded in a large “teacher” BERT can be effectively transferred to a small “student” TinyBERT.
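A minimal sketch of Transformer knowledge distillation in the spirit of the entry above: the student matches the teacher's temperature-softened logits and, via a learned projection, its hidden states. The shapes, the single projection layer, and the equal loss weighting are illustrative assumptions, not TinyBERT's exact layer-wise recipe.

```python
# Sketch: soft-logit KL loss plus a hidden-state matching loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits,
                      student_hidden, teacher_hidden,
                      proj: torch.nn.Linear, temperature: float = 2.0):
    t = temperature
    # soft-target loss on logits (KL with temperature scaling)
    soft = F.kl_div(F.log_softmax(student_logits / t, dim=-1),
                    F.softmax(teacher_logits / t, dim=-1),
                    reduction="batchmean") * (t * t)
    # hidden-state loss: project the student's hidden size up to the teacher's
    hid = F.mse_loss(proj(student_hidden), teacher_hidden)
    return soft + hid

# shapes: batch=8, seq=16, student hidden=128, teacher hidden=256, 2 classes
proj = torch.nn.Linear(128, 256)
loss = distillation_loss(torch.randn(8, 2), torch.randn(8, 2),
                         torch.randn(8, 16, 128), torch.randn(8, 16, 256), proj)
loss.backward()
```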
- Zhengjie Miao, Yuliang Li, Xiaolan Wang
- 2021
Computer Science
SIGMOD Conference
Rotom is a multi-purpose data augmentation framework for a range of data management and mining tasks, including entity matching, data cleaning, and text classification, that automatically learns a policy for combining examples from different DA operators, thereby combinatorially reducing the hyper-parameter space.
- J. E. Hu, Yelong Shen, Weizhu Chen
- 2022
Computer Science
ICLR
Low-Rank Adaptation, or LoRA, is proposed, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.
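A minimal sketch of a LoRA-adapted linear layer as described above: the pre-trained weight is frozen and a trainable low-rank update B·A, scaled by alpha/r, is added to its output, so only r·(d_in + d_out) parameters are trained per adapted layer. The rank, scaling, and initialization below are illustrative defaults.

```python
# Sketch: frozen base linear layer plus a trainable low-rank update.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad = False          # freeze W
        if self.base.bias is not None:
            self.base.bias.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # starts at 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W x + (alpha/r) * B A x
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))  # equals the frozen layer's output until B is trained
```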
- S. Longpre, Le Hou, Adam Roberts
- 2023
Computer Science, Education
ICML
It is found that task balancing and enrichment techniques are overlooked but critical to effective instruction tuning and, in particular, that training with mixed prompt settings yields stronger performance in all settings.
...