Aspera: Moving the bio-omics data at maximum speed

Preface
I want to develop this Subions into a platform that helps biologists in developing countries apply bioinformatics to their own data. In that sense, I hope Subions can become a "Belt and Road" for bioinformatics. Each WeChat post will be published in both Chinese and English. This post introduces Aspera, a high-speed data transfer tool, and I hope it serves as a useful guide to installing and using the software correctly.

Introduction

    With the growth of bioinformatics and the falling cost of high-throughput sequencing, more and more multi-omics data are being archived. Biological laboratories face a growing challenge: transferring large files and massive datasets quickly and reliably between institutions and individuals around the world. Failing to meet this challenge can limit a laboratory's ability to pursue the research that leads to major scientific findings.
    Open-source alternatives such as Tsunami may work well under special, controlled network conditions, but they come at a high cost to network efficiency. Aspera's FASP technology was therefore developed to transfer large files efficiently and at low cost.

Installation

    Aspera can be installed on Linux, Windows and Mac. Here we focus on Linux, since most analysis pipelines run on it. If your laptop runs one of the other two systems, installing a Unix-like system in advance is highly recommended. The installation commands are as follows:

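# Download the Aspera Connect installer archive
wget https://ak-delivery04-mul.dhe.ibm.com/sar/CMA/OSA/092u0/0/ibm-aspera-connect-3.10.0.180973-linux-g2.12-64.tar.gz
# Unpack the archive to obtain the installer script
tar xvf ibm-aspera-connect-3.10.0.180973-linux-g2.12-64.tar.gz
# Run the installer
sh ibm-aspera-connect-3.10.0.180973-linux-g2.12-64.sh
# Move into the installation directory in your home folder
cd ~/.aspera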
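
Before running any transfers, it can help to confirm where the installer put the ascp binary and the bundled key file. The snippet below is only a minimal sketch. The location $HOME/.aspera/connect, its bin/ and etc/ subfolders, and the helper names aspera_paths and aspera_check are assumptions for illustration, so adjust them if your installation differs.

```shell
# Assumption: the installer used its usual per-user location,
# $HOME/.aspera/connect. Override ASPERA_HOME if yours differs.
ASPERA_HOME="${ASPERA_HOME:-$HOME/.aspera/connect}"

# Print the expected paths of the ascp binary and the bundled key file.
# (Hypothetical helper; the bin/ and etc/ layout is an assumption.)
aspera_paths() {
    echo "ascp=$ASPERA_HOME/bin/ascp"
    echo "key=$ASPERA_HOME/etc/asperaweb_id_dsa.putty"
}

# Succeed only if an executable ascp exists in the expected place.
aspera_check() {
    if [ -x "$ASPERA_HOME/bin/ascp" ]; then
        echo "ascp found: $ASPERA_HOME/bin/ascp"
    else
        echo "ascp not found under $ASPERA_HOME" >&2
        return 1
    fi
}
```

If the check passes, something like `aspera_check && export PATH="$ASPERA_HOME/bin:$PATH"` makes ascp callable from any directory for the rest of the session.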

Application example

    NCBI and EBI are two of the major repositories for biological data. Here, let's start by downloading raw genomic sequencing data from EBI with Aspera.
    Take the sequencing data submitted by Keohane et al. as an example [1]. Open the EBI website and search for the BioProject accession PRJEB36820. Then download the text file that lists the download links, from the location marked with a red arrow in the screenshot.

[Screenshot: EBI project page for PRJEB36820, with the link-file download marked by a red arrow]

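    The download command is shown below. As written, it assumes that ascp is on your PATH and that the key file asperaweb_id_dsa.putty is in the current directory. The -l option caps the transfer rate (here 100 Mbit/s), -P sets the port, -i points to the key file, and the final "." saves the file to the current directory: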

ascp -Q -T -l 100M -P 33001 -i asperaweb_id_dsa.putty era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/ERR395/007/ERR3957750/ERR3957750_1.fastq.gz . 
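
When a project contains many runs, typing one ascp command per file quickly becomes tedious. The sketch below prints one command per link found in the downloaded text file, reusing the options from the command above. It rests on assumptions that are not stated in this post: that the text file is tab-separated with a header row, and that it has a column named fastq_aspera whose entries are ";"-separated links of the form fasp.sra.ebi.ac.uk:/vol1/... The helper name print_ascp_commands is hypothetical.

```shell
# Print one ascp command per FASTQ link found in an EBI link file.
# Assumptions: tab-separated file with a header row and a column named
# "fastq_aspera"; each entry holds one or more links separated by ";".
# Usage: print_ascp_commands LINK_FILE KEY_FILE [OUT_DIR]
print_ascp_commands() {
    link_file=$1
    key_file=$2
    out_dir=${3:-.}
    awk -F '\t' -v key="$key_file" -v out="$out_dir" '
        NR == 1 {
            # Find the fastq_aspera column by its header name.
            for (i = 1; i <= NF; i++) if ($i == "fastq_aspera") col = i
            next
        }
        col && $col != "" {
            # Paired-end runs list several files separated by ";".
            n = split($col, links, ";")
            for (j = 1; j <= n; j++)
                printf "ascp -Q -T -l 100M -P 33001 -i %s era-fasp@%s %s\n", key, links[j], out
        }
    ' "$link_file"
}
```

For example, `print_ascp_commands links.txt asperaweb_id_dsa.putty > download.sh` (links.txt being a placeholder file name) writes the commands to a script that you can inspect first and then run with `sh download.sh`. Printing instead of executing is a deliberate choice here, so that a wrong column or path is noticed before any transfer starts.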

    In my tests, the average speed was at least 10 Mb per second, and it sometimes exceeded 100 Mb per second.

[1] Keohane DM, Ghosh TS, Jeffery IB, Molloy MG, O'Toole PW, Shanahan F. Microbiome and health implications for ethnic minorities after enforced lifestyle changes. Nat Med. 2020 Jul;26(7):1089-1095. doi: 10.1038/s41591-020-0963-8. Epub 2020 Jul 6. PMID: 32632193.
