
Genome Research

Related terms:

Ontology

Bioinformatics

Profiling

Human Genome Project


Human Genome Project: Japanese Perspective

T. Gojobori, in International Encyclopedia of the Social & Behavioral Sciences, 2001

2 The Present Situation of the Human Genome Project in Japan

As genome research advances internationally, there has been increasing recognition that genome research constitutes the necessary foundation for the future development of biology, medical research, and biotechnology. In particular, the necessity of establishing a large-scale genome-sequencing center has been accepted widely. However, the realization of this plan has been difficult because of budget limitations.

Since the Basic Law on Science and Technology was passed in 1995, the budget limitations have, to some extent, disappeared under the national goal that Japan should become a country of creative science and technology. Because of this basic law, the government invested approximately US$200 million in the genome research project in the 1998 fiscal year alone. Presently, various activities in JHGP are conducted under the initiative of governmental organizations and private industries. MONBUSHO established the Center for Information Biology at NIG to support DDBJ, as well as the new Human Genome Research Center at Tokyo University. The Ministry of International Trade and Industry (MITI) and the MHW coordinated a promotion business venture of genome-related enterprises, and the STA created the Genome Science Center in Yokohama. Moreover, large research grants were made to many genome research groups through various funding mechanisms. However, not all of these activities follow well-planned strategies of genome research.

The general focus of JHGP is toward the determination of large-scale genome sequences and a functional analysis of the genes that will be discovered by sequence determination. JHGP is placing more emphasis on the areas of genomic structure, function, and information. As for genome sequence determination, Kitasato University, the Japan Marine Science and Technology Center under the STA, and the Product Evaluation Center under MITI are pushing the genome sequencing of microorganisms, producing 1–2 Mb each year. In 1995, the Japan Science and Technology Corporation initiated the human genome-sequencing project. Subgenomic segments, such as the HLA, immunoglobulin variable, Down's syndrome, and oncogene-related regions, have been determined for about 15 Mb in total by Kitasato University, the Cancer Institute, Keio University, and Tokai University.

In total, the sequence production in Japan is about 5 Mb per year, which is one order of magnitude smaller than that of the Sanger Centre in the United Kingdom or the Genome Sequencing Center at Washington University in the United States. Therefore, JHGP has great expectations of the role of the newly established Genome Science Center, which is planned to produce 30–50 Mb per year. The Kazusa DNA Research Institute is producing several Mb of Arabidopsis genome sequence per year. Moreover, the genome project team supported by MAFF has a ten-year plan for sequencing the complete 400 Mb rice genome, producing 20 Mb per year for the initial several years. In summary, the sequencing capacity for genomes in Japan has so far been 15–20 Mb per year. Judging from recent increases in the number of genome research groups focusing on bacteria, fungi, lower organisms, insects, and fish, and from the enhancement of genome research on humans and the mouse, a sequence production of 200 Mb per year is expected in the near future.

JHGP has a slight edge over other countries in the field of functional genomics. In particular, the Osaka University genome team was the first to analyze a cDNA expression profile of human genes (an approach called 'body mapping'), and other genome teams have quickly developed full-length cDNA determination techniques. Although these activities were not directly connected to bioindustry as they are in the United States, MITI recently pushed the utilization of these technical advances into industry. Moreover, human genome diversity has become recognized as important for disease gene analysis and functional analysis. In particular, the so-called 'complex diseases', such as multicomponent diseases, are becoming important targets of the human genome diversity project (see Human Genome Diversity and its Value). This project will require a great deal of genetic data concerning the Japanese population, in particular a large amount of polymorphism data from several thousand people (see Race: Genetic Aspects).

Several preliminary research projects have already started in Japan. The cDNA project on the Caenorhabditis elegans (C. elegans) genome has been conducted by NIG in cooperation with the Sanger Centre and Washington University. The B. subtilis and E. coli genome teams have already begun functional analysis of all the genes contained in their respective organisms. Many researchers are participating in these genome projects by conducting bioinformatics research and constructing databases. Thus, JHGP has resumed its original goals in a more integrated and coordinated way.


Bioethics: Examples from the Life Sciences

F. Thiele, in International Encyclopedia of the Social & Behavioral Sciences, 2001

4.2.4 Genetically modified organisms

An area of application for the results of genome research is the manipulation of organisms. The introduction of genes into bacteria, plants, or animals is well established. The transgenic organisms produced by this method are used either to perform further research on the transferred gene or to produce certain rare substances in a bio-factory, e.g., the generation of insulin in transgenic bacteria. Additionally, it is hoped that transgenic organisms will help improve the food supply in developing countries.

The moral problems of genetically modified organisms arise from possible deliberate interventions in 'natural' biodiversity as well as from alleged infringements of animal rights. The destruction of biodiversity is a controversial problem, tightly connected with the development of modern societies. Usually it is presupposed that the description and quantification of biodiversity have been scientifically and unambiguously clarified. Contrary to this expectation, neither biology nor other disciplines dealing with corresponding parameterizations, such as economics, have presented uniform concepts that can serve as a basis for the measurability and comparability of biodiversity.


Introduction to Human Genome Computing Via the World Wide Web

Lincoln D. Stein, in Guide to Human Genome Computing (Second Edition), 1998

3.6 Species-Specific Databases

In addition to the large community databases like GenBank, EMBL and GDB, there are hundreds of smaller species-specific databases available on the Web. Although they do not offer the comprehensive range of the big databases, they are a good source of unfiltered primary data. In addition, they may be more timely than the community databases because of the inevitable lag between data production and publication in the latter.

Notable sites in this category include:

Whitehead Institute/MIT Center for Genome Research. The data available at this Web site include genome-wide genetic and physical maps of the mouse, physical maps of the human, a genetic map of the rat, and human chromosome 17 DNA sequence.

MGD (Mouse Genome Database). This database, based at the Jackson Laboratory, contains mouse physical and genetic mapping information, DNA sequencing data, and a rich collection of mouse strains and mutants.

Stanford Human Genome Center. This is the site of an ongoing project to produce a high-resolution radiation hybrid map of the human genome.

FlyBase. This Web database, hosted at Indiana University, is a repository of maps, reagents, strains and citations for Drosophila melanogaster.

ACEDB. The ACEDB database stores mapping, sequencing, citation and developmental information on Caenorhabditis elegans and other organisms. The Genome Informatics Group at the University of Maryland maintains a Web site at the URL given below, which provides interfaces both to the C. elegans database and to a variety of plant, fungal and prokaryotic genomes.

SGD (Saccharomyces Genome Database). The Stanford Genome Center hosts the Saccharomyces Genome Database, a repository of everything that is worth knowing about yeast (now including the complete DNA sequence).

TIGR (The Institute for Genomic Research). The TIGR site contains partial and complete genomic sequences of a large number of prokaryotic, fungal and protozoan organisms. Its 'Human Gene Index' is a search interface to the large number of human expressed sequences that have been produced by TIGR and other groups.

Washington University Genome Sequencing Center. This is the home of several large-scale genome sequencing projects, including human and mouse EST sequencing, C. elegans genomic sequencing, and human genomic sequencing (primarily chromosomes 2 and 7).

The Sanger Centre. The Sanger Centre is another source of extensive DNA sequencing information. Projects include the genomic sequence of C. elegans, and human chromosomes 1, 6, 20, 22 and X. In addition to its sequencing efforts, the Sanger Centre also produces chromosome-specific human radiation hybrid maps.

University of Washington. The University of Washington Genome Center is sequencing human chromosome 7 (in collaboration with Washington University), as well as the human leukocyte antigen (HLA) class I region and the mouse T-cell receptor region.


Societal Issues

Jules J. Berman Ph.D., M.D., in Principles of Big Data, 2013

Hubris and Hyperbole

Intellectuals can tell themselves anything, sell themselves any bill of goods, which is why they were so often patsies for the ruling classes in nineteenth-century France and England, or twentieth-century Russia and America.

Lillian Hellman

I know lots of scientists; the best of them lack self-confidence. They understand that their data may be flawed, their assumptions may be wrong, their methods might be inappropriate, their conclusions may be unrepeatable, and their most celebrated findings may one day be discredited. The worst scientists and physicians are just the opposite—confident of everything they do, or say, or think.236

The sad fact is that, among scientific disciplines, Big Data is probably the least reliable, providing major opportunities for blunders. Prior chapters covered limitations in measurement, data representation, and methodology. The biases encountered in every Big Data analysis were covered in Chapter 10. Apart from these limitations lies the ever-present dilemma that assertions based on Big Data analyses can sometimes be validated, but they can never be proven true. Confusing validation with proof is a frequently encountered manifestation of overconfidence.

Validation is achieved when a data-related hypothesis provides a correct answer whenever the hypothesis is tested. It is tempting to infer that if you have tested a hypothesis over and over again, and it always passes your tests, then you've proven that the hypothesis is true. Not so.
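A toy simulation (ours, not Berman's) makes the point concrete: a hypothesis that "every sampled value is below 4" survives tens of thousands of tests on normally distributed data before a rare draw finally falsifies it. Repeated validation never amounted to proof.

```python
import random

random.seed(1)
# Toy hypothesis: "every sampled value is below 4.0".
# On standard-normal data it passes tens of thousands of tests in a row,
# yet it is false -- a rare large draw eventually falsifies it.
passes = 0
while True:
    value = random.gauss(0.0, 1.0)
    if value < 4.0:
        passes += 1
    else:
        print("hypothesis validated %d times, then falsified" % passes)
        break
```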

If you want to attain proof, you must become a mathematician; mathematics is the branch of science devoted to truth. With math, you can prove that an assertion is true, you can prove that an assertion is false, and you can prove that an assertion cannot be proven to be true or false. Mathematicians have the monopoly on proving things. None of the other sciences have the slightest idea what they're doing when it comes to proof.

In nonmathematical sciences, such as chemistry, biology, medicine, and astronomy, assertions are sometimes demonstrably valid (true when tested), but assertions never attain the level of a mathematical truth (proven that it will always be true, and never false, forever). Nonetheless, we can do a little better than showing that an assertion is simply valid. We can sometimes explain why an assertion ought to be true for every test, now and forever. To do so, an assertion should have an underlying causal theory that is based on interactions of physical phenomena that are accepted as true. For example, F = ma ought to be true because we understand the concepts of mass and acceleration, and we can see why the product of mass and acceleration produces a force. Furthermore, everything about the assertion is testable in a wide variety of settings.

Big Data analysts develop models that are merely descriptive (e.g., predicting the behavior of variables in different settings), without providing explanations in terms of well-understood causal mechanisms. Trends, clusters, classes, recommenders, and so on may appear to be valid over a limited range of observations, but may fail miserably in tests conducted over time with a broader range of data. Big Data analysts must always be prepared to abandon beliefs that are not actually proven.237
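A minimal numerical illustration of this failure mode (our example, not the book's): a linear trend fitted over a narrow range of a saturating process predicts well inside that range and fails badly outside it.

```python
import numpy as np

# Saturating "truth" (e.g., a dose-response curve), observed only on [0, 1].
truth = lambda x: 1 - np.exp(-x)

x_obs = np.linspace(0.0, 1.0, 50)
slope, intercept = np.polyfit(x_obs, truth(x_obs), 1)  # descriptive linear model

for x in (0.5, 1.0, 3.0, 6.0):
    pred = slope * x + intercept
    print("x=%.1f  true=%.3f  linear model=%.3f" % (x, truth(x), pred))
# Inside the observed range the fit looks valid; far outside it, the
# prediction exceeds 1.0 -- impossible for the true saturating process.
```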

Finance has eagerly entered the Big Data realm, predicting economic swings, stock values, buyer preferences, the impact of new technologies, and a variety of market reactions, all based on Big Data analysis. For many financiers, accurate short-term predictions have been followed, in the long run, with absolutely ruinous outcomes. In such cases, the mistake was overconfidence—the false belief that their analyses will always be correct.238

In my own field of concentration—cancer research—there has been a major shift of effort away from small experimental studies toward large clinical trials and so-called high-throughput molecular methods that produce vast arrays of data. This new generation of cancer research costs a great deal in terms of manpower, funding, and the time to complete a study. The funding agencies and the researchers are confident that a Big Data approach will work where other approaches have failed. Such efforts may one day lead to the eradication of cancer—who is to say? In the interim, we have already seen a great deal of time and money wasted on huge, data-intensive efforts that have produced predictions that are unreproducible and no more valuable than a random throw of dice.89,90,224,239,240

Despite the limitations of Big Data, the creators of Big Data cannot restrain their enthusiasm. The following is an announcement from the National Human Genome Research Institute (NHGRI) concerning its own achievements:

In April 2003, NHGRI celebrated the historic culmination of one of the most important scientific projects in history: the sequencing of the human genome. In addition, April 2003 marked the 50th anniversary of another momentous achievement in biology: James Watson and Francis Crick's Nobel Prize winning description of the DNA double helix and to mark these achievements in the history of science and medicine, the NHGRI, the NIH and the DOE held a month-long series of scientific, educational, cultural and celebratory events across the United States.241

In the years following this 2003 announcement, it has become obvious that the genome is much more complex than previously thought, that common human diseases are genetically complex, that the genome operates through mechanisms that cannot be understood by examining DNA sequences, and that much of the medical progress expected from the Human Genome Project will not be forthcoming anytime soon.239,242,243 In a 2011 article, Eric Lander, one of the luminaries of the Human Genome Project, was quoted as saying "anybody who thought in the year 2000 that we'd see cures in 2010 was smoking something."243 Monica Gisler and coworkers have hypothesized that large-scale projects create their own "social bubble," inflating the project beyond any rational measure.244 It is important that Big Data proselytizers, myself included, rein in their enthusiasm.


Human Genome Project: German Perspective

J. Maurer, H. Lehrach, in International Encyclopedia of the Social & Behavioral Sciences, 2001

3.4 The German Human Genome Project

During the 1980s, the tremendous successes of genetic engineering boosted the biotechnology industry in the USA and provided the prerequisite for the idea of the Human Genome Project. While US companies like Amgen turned over more than US$1 billion per year, the production of recombinant pharmaceuticals was still hampered in Germany; recombinant pharmaceuticals and vaccines had to be imported instead. One of the first dedicated centers for genome research in Germany, the Genzentrum in Munich, had bulletproof windows installed for fear of an attack. The international Human Genome Project was started in 1990 without any German participation, but increasing economic pressure forced Germany to rethink genome research. After a thorough investigation of the issue by a parliamentary commission, two programs of the Christian Democratic-Liberal coalition started in the mid-nineties. The BioRegio Competition aimed at exploiting the synergy of self-organization of whole regions as centers of excellence for biotechnology and was unexpectedly successful. The German Human Genome Project (DHGP), integrated into the international effort to decipher the human genome, was thematically much more focused. It was strongly industry-oriented, and a special infrastructure was to guarantee the effective transfer of the knowledge gained to industry. Initial efforts to raise half of the budget from the German pharmaceutical industry failed; subsequently, the whole budget of DM 40 million per year was granted by the Federal Ministry of Education and Research. To safeguard the interests of industry, a Patent and Licensing Agency (PLA) was established, whose task was to screen the results of the projects funded within the DHGP and to guarantee a patent application where applicable. The patent is granted to the inventor at no cost; in turn, the Industry Association gets the right of first refusal on the patent.

The agreement also gave the Industry Association the right to screen the data produced in the German DNA sequencing centers—a clear infringement of the so-called Bermuda Principles—an agreement between all academic sequencing centers worldwide making all data immediately publicly available without any restrictions. The Industry Association waived their claims only after massive pressure from the international scientific community.

The DHGP has had a profound effect on the German biotechnological industry. Contrary to the expectation that the big pharmaceutical companies would exploit the scientific results, licensing of patents was sparse. Instead, many researchers preferred to start their own companies, a tendency up to then only rarely found in German academia. The success of the DHGP initiative led to Germany exceeding the United Kingdom in the number of biotech companies by the year 2000.

Initial concerns among scientists that funding of genome research would be cut after the change of government to a coalition of Social Democrats and the Green Party proved groundless. Not only has the DHGP been maintained, but a plant genome project—initiated by the prior administration—and a microorganism genome project have also been started, assisted by additional projects on bioinformatics and proteomics. In 2000, a long-expected concept document was released by the Federal Ministry of Education and Research. This paper draws up a long-term perspective for genomic research in Germany and safeguards the existence of central infrastructure units (resource centers) for genomic research. At the same time, boosted by the announcement of the draft version of the human genome, a tremendous increase in funding was approved that eventually brought Germany in line with genomic funding in countries like Japan, the UK, and France.

The economic success as well as the prospects of new diagnostic and therapeutic possibilities have very much changed the German attitude towards human genetics. Medical genetics is now much more accepted than it was a few years ago. In contrast, due to concerns about health as well as the aggressive marketing strategies of certain agricultural biotech companies, plant genetic engineering has met with deep disapproval in Germany. However, the fatalistic attitude in Germany towards the fundamental natural sciences has somewhat changed. The former lack of discussion in society, which permitted simplistic arguments to gain predominance, has been overcome. The majority of the German intellectual elite has realized that an informed discussion within society is needed to solve the problems raised by genome sciences.


Quality Evaluation of Rice

Yukiharu Ogawa, in Computer Vision Technology for Food Quality Evaluation, 2008

1 Introduction

Rice (Oryza sativa L.) is one of the major commercial cereal grains worldwide, along with wheat and corn. In the order of 628 million tonnes of rice were produced throughout the world in 2005, and the world trade in the commodity that same year was 29.9 million tonnes, as estimated by the FAO (2006). Over 90 percent of rice is produced and consumed in Asia.

Since the mapping of the rice genome began, genetic studies, such as genome research on rice, have progressed. Rice is therefore currently studied in many academic fields, including plant science, breeding, crop science, and food science. Although the aims of rice studies vary, quality evaluation of the grain as a foodstuff is one of the main goals. Computer vision technology, which is progressing all the time with the continuous development of both hardware and software, can contribute to such quality evaluation by assessing the quality of rice grains objectively, consistently, and quantitatively.
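To give a flavor of how such an objective assessment might be implemented, here is a minimal sketch (ours, not from the chapter) that uses OpenCV to segment grains from a dark background and report simple per-grain size statistics; the file name, threshold strategy, and area cutoff are assumptions, and the call signatures follow the OpenCV 4 API.

```python
import cv2

# Minimal sketch: segment rice grains from a dark background and report
# per-grain areas -- one simple, quantitative quality measurement.
img = cv2.imread("rice_grains.png")                 # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
_, mask = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# OpenCV 4 returns (contours, hierarchy); OpenCV 3 returned three values.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
areas = [cv2.contourArea(c) for c in contours if cv2.contourArea(c) > 50]

print("grains detected:", len(areas))
if areas:
    print("mean grain area (px):", sum(areas) / len(areas))
```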

In this chapter, various techniques and methods for the quality evaluation of rice using computer vision technology are described. Rice research has various aspects, as mentioned above, and the significance of the rice quality differs within each – for example, the quality of rice as a foodstuff is different from that as a raw material. An outline of rice quality is thus described in the next section. Rice as a raw material ("raw rice") and as a prepared foodstuff ("cooked rice") is classified in the following sections and described together with the different evaluation techniques.


Applications in Data-Intensive Computing

Anuj R. Shah, ... Nino Zuljevic, in Advances in Computers, 2010

3.1.1.1 The Challenge

In much the same way that computer components have undergone continuous significant improvement in performance, the technology for determining the linear sequence of molecular units in genes and proteins has transformed the way in which modern biological research is conducted. At one time, determining the sequence of a single gene or protein required heroic effort and led to an environment where laboratories (by necessity) specialized in studying a single gene or class of genes. Today, things are much different. Sequencing technology has improved to the point where obtaining the complete genome for cultured organisms is within the reach of many single-investigator grants, in terms of both expense and time. Small- and large-scale genome-sequencing centers regularly complete and publish newly sequenced genomes, typically doubling the entire volume of sequenced genes every 18 months (http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html)! This has led to an explosion in the public genome information available and a data-driven revolution in the field of microbial biology. More complex eukaryotic organisms such as plants and animals have correspondingly more complex genomes. Analyzing the sequences of eukaryotes comes with additional complexity because many genes have alternate ways of being assembled from the linear chromosomal sequence, a feature that is not common in microbes.
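The cited 18-month doubling time implies strikingly fast growth; a quick calculation (ours) of the implied multiplier over a few horizons:

```python
# Growth implied by an 18-month doubling time of sequence databases.
doubling_months = 18
for years in (3, 5, 10):
    factor = 2 ** (years * 12 / doubling_months)
    print("after %2d years: x%.0f" % (years, factor))
# ~x4 in 3 years, ~x10 in 5 years, ~x100 in 10 years.
```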

A new direction of genome research, known as metagenomics, is focused on understanding the genes produced by communities rather than isolated organisms. Metagenomics is evolving quickly, enabled by continued improvement in sequencing technology. It is also driving new kinds of biological investigations that involve interplay between processes and metabolic pathways that span multiple organisms. This is an important key to understanding any biosystem because humans and microbes generally live as symbionts; for instance, humans are hosts for microbes that are essential to our health.

Regardless of whether one is studying relatively simple microbial genomes, more complex eukaryotes, or extremely complex community systems, the availability of sequence data is both a blessing and a curse. On one hand, a vast and rapidly growing resource of annotated sequence data is available to help characterize newly sequenced systems. But the main drawback is that finding true relationships among the many apparent ones gets harder as the underlying data set grows. A typical full-genome analysis is already beyond the reach of desktop systems unless one is willing to wait days or weeks for the results; analysis of multiple full genomes is even further beyond reach. Unfortunately, this often forces biologists into an unpleasant choice between disregarding some data in the hope of reducing the analysis to a tractable computing time, and accepting sequence analysis as the rate-determining step.

As an example, a recent paper from the Venter Institute [65] details the analysis of a large, geospatially distributed collection of genome community samples. This analysis required months of computing just to compare all the sequences against one another to feed downstream analysis. For most biologists, this level of computing is not available, but the desire and need to operate at this scale exist. If data are available (as they are in this case) and a hypothesis requires one to operate on those data (as is often the case), there should be widely accessible tools for performing the analysis. Similar situations have been faced in astrophysics, chemistry, climate modeling, and a host of other scientific applications. The challenge for DIC is to bring large-scale biological analysis within reach of bench biologists without forcing them to specialize in HPC or algorithm implementation.
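The quadratic cost of that all-against-all comparison is easy to appreciate with a rough estimate (ours; the per-comparison time is an illustrative assumption, not a measured figure):

```python
# Rough scaling estimate for all-vs-all sequence comparison: n sequences
# require n*(n-1)/2 pairwise comparisons, so 10x more data means ~100x more work.
per_pair_seconds = 1e-4   # assumed average cost of one pairwise comparison

for n in (10_000, 100_000, 1_000_000):
    pairs = n * (n - 1) // 2
    days = pairs * per_pair_seconds / 86_400
    print("n=%9d  pairs=%.2e  ~%.1f CPU-days" % (n, pairs, days))
# At a million sequences, the comparisons alone run to hundreds of CPU-days,
# quickly outgrowing any single desktop system.
```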


Integrated Population Biology and Modeling, Part B

Katabathula Ramachandra Murthy, ... Vinay Varadan, in Handbook of Statistics, 2019

1 Introduction

The pathobiology of complex diseases such as cancer typically involves coordinated dysregulation of multiple genes interacting in intricate ways that are not yet fully understood. Characterizing the complex interplay of cellular processes in cancer using in-depth molecular measurements would enable uncovering the key mechanisms underlying its development and progression. Significant advances in molecular measurement technologies have resulted in a wide range of tumor profiling capabilities ranging from measuring individual molecular characteristics per tumor cell in a given tissue sample, such as genomic copy number alterations using fluorescent in situ hybridization (FISH) and protein expression levels by immunohistochemistry (IHC); to tissue-level expression levels of a few genes using quantitative polymerase chain reaction (qPCR); and ultimately to genome-scale measurements of mutations, copy number alterations, and gene expression levels using next generation sequencing (NGS) of DNA or RNA derived from tissues or cells. Each of these types of measurement technologies is useful in the diagnosis, prognosis, and treatment of cancer, with genome-scale measurement technologies quickly becoming more reliable and cost-effective, thus likely replacing the more focused measurement technologies in both research and clinical settings (Kamalakaran et al., 2013).

These technological advances in genome-scale measurement technologies have enabled large-scale molecular profiling of cancer tissues by translational scientists, an effort exemplified by The Cancer Genome Atlas (TCGA) consortium that was initiated and funded by the National Cancer Institute and the National Human Genome Research Institute. Over the past decade the TCGA has generated genome-scale molecular profiles of over 11,000 cancer samples, spanning 33 cancer types and totaling over 2.5 petabytes of data. Such large-scale profiling efforts have resulted in a paradigm shift in our understanding of cancer, posing both new challenges and opportunities for the treatment of this complex and aggressive set of diseases. On the one hand, large-scale genomic profiling of cancers has revealed that tumors typically harbor irregularities across genomic (mutations and structural variations), transcriptomic (differential expression of coding and noncoding RNAs), and epigenetic (DNA methylation and histone modifications) modalities. These multiomic abnormalities have revealed substantial molecular heterogeneity, resulting in multiple molecularly defined subtypes even within a single cancer type such as breast cancer, each of which exhibits different clinical outcomes and response rates to frontline therapies (Lehmann et al., 2011, 2016; Morganella et al., 2016; Nik-Zainal et al., 2016; Perou et al., 2010; Sanchez-Garcia et al., 2014; Varadan et al., 2016a,b; Yates et al., 2017). Thus, one of the principal challenges of this postgenomic era of cancer research is deciphering how diverse sets of molecular alterations jointly contribute to an individual tumor's phenotype in order to enable the development of personalized strategies in the clinic. This challenge has spawned an entire research field in bioinformatics focused on developing methodologies for integrative analysis of multiomics cancer profiles (Creixell et al., 2015; Dimitrova et al., 2017; Greenblum et al., 2011; Kristensen et al., 2012; Osmanbeyoglu et al., 2014; Razi et al., 2015; Sedgewick et al., 2013; Tarca et al., 2008; Vandin et al., 2011; Varadan et al., 2012; Vaske et al., 2010).

On the other hand, pan-cancer integrative analyses, which involve the analysis of molecular profiles across multiple cancer types, have identified commonalities in oncogenic processes in cancers arising in completely different organs (Ciriello et al., 2013; Ding et al., 2018). The recent Pan-Cancer Atlas set of publications provides new opportunities to identify molecularly targeted treatment strategies that can be applied to different cancer types, irrespective of their tissue of origin (Kruger, 2018). Indeed, the clinical implication of this paradigm-shifting view of cancer is being explored in two large clinical trials at the National Cancer Institute, known as NCI-MATCH and NCI-MPACT. Patients in the NCI-MATCH trial are assigned to receive a specific targeted treatment based on molecular profiles of their individual tumor samples, irrespective of the tissue of origin of the cancer (Coyne et al., 2017; Do et al., 2015). If successful, these trials will usher in a new paradigm of precision cancer medicine, where molecular biomarkers will be employed in the clinic to assign patients to specific interventions.

The above challenge and opportunity thus strongly motivate the development of new computational approaches to identify cancer biomarkers from molecular profiling data of patient tumors. Biomarker discovery specific to cancer has been an area of significant interest in the statistical and informatics communities, given the unique challenges of the relatively small numbers of patients (<10²) when compared to the number of diverse molecular features (≫10⁴) measured per patient. This chapter is motivated by this dual challenge of integrating diverse molecular data (multiomics measurements) in relatively small patient populations to discover the subset of biologically or clinically relevant molecular features (biomarkers). In the machine learning and data science literature, mining informative features is an active area of research. Most of the literature focuses either on feature transformation, which generates a new set of features, or on selecting feature subsets from the original features. The true meaning of features gets modified when they are transformed, and the selection of feature subsets requires an exhaustive search of the feature space due to high correlations (Chandrashekar and Sahin, 2014; Cunningham and Ghahramani, 2015; Sreevani and Murthy, 2017; Sreevani et al., 2018). Latent trait theory, in contrast, offers flexibility by relating the observable features to unobserved (latent) variables. Thus, we focus on the application of a well-developed area of statistical analysis, item response theory (IRT), also known as latent trait theory, to this challenge of multiomics biomarker discovery in cancer.

The goal of this chapter is to model different biological aspects of cancers using patients' biomarker information and IRT models on the latent trait; the primary aim is to identify informative biomarkers for cancers in a clinical trial. First, different item response theory models are reviewed from the perspective of clinical studies. Then a criterion function called the biomarker (item) information function is discussed, which describes the properties of biomarkers on the latent scale. This is followed by a Bayesian framework for IRT models and a simulated experimental study for biomarker selection.
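For readers unfamiliar with IRT, here is a compact sketch (ours, not from the chapter) of the standard two-parameter logistic (2PL) model and its item information function, the quantity the biomarker-selection criterion is built on: for discrimination a and difficulty b, P(θ) = 1 / (1 + exp(-a(θ - b))) and I(θ) = a²P(θ)(1 - P(θ)).

```python
import math

def p_2pl(theta, a, b):
    """2PL response probability: P(theta) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Item (biomarker) information for the 2PL model: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# A discriminating biomarker (large a) is informative near its difficulty b
# and nearly uninformative far from it.
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(item_information(theta, a=2.0, b=0.0), 3))
```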


'Big data', Hadoop and cloud computing in genomics

Aisling O'Driscoll, ... Roy D. Sleator, in Journal of Biomedical Informatics, 2013

4 Challenges

It must be emphasised that big data technologies are very much in their infancy and that, although powerful, they have a long way to go. Programming Hadoop requires a high level of Java expertise to develop parallelised programs. Efforts have been made to simplify this process, even in the technology sector, with software libraries such as Hive adding an SQL-like interface that generates parallelised Hadoop jobs in the background. Hadoop streaming has also been made available to circumvent complex Java programming by wrapping jobs in Python, a more lightweight scripting language. Another point of consideration is that Hadoop MapReduce is designed by default to treat each line as an individual record. As many standard sequence formats involve multiple lines per sequence, it is necessary to manipulate the data into one-line formats, or to program custom Hadoop input formats and record readers – a less than trivial task (a minimal flattening filter is sketched below).
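As a concrete illustration of the one-line reformatting just described, this is a minimal sketch (ours, not from the article) of a Hadoop-streaming-style Python filter that collapses multi-line FASTA records into single tab-delimited lines; the FASTA layout is standard, but the script itself is hypothetical.

```python
#!/usr/bin/env python
# Minimal sketch: collapse multi-line FASTA records into one-line,
# tab-delimited records (header <TAB> sequence), so that a default
# line-oriented Hadoop MapReduce job sees one record per line.
import sys

def flatten_fasta(stream):
    header, chunks = None, []
    for line in stream:
        line = line.rstrip("\n")
        if line.startswith(">"):            # a new record begins
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:                          # sequence continuation line
            chunks.append(line)
    if header is not None:                  # flush the final record
        yield header, "".join(chunks)

if __name__ == "__main__":
    for name, seq in flatten_fasta(sys.stdin):
        print("%s\t%s" % (name, seq))
```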

Furthermore, there is a current trend towards developing analytics and visualisation technologies on top of the Hadoop platform, to enable better standardisation of reporting and summarisation of results. This problem is not adequately addressed even in the technology sector, and solving it is vital if the technology is to be widely embraced by diverse industry sectors. Hadoop is currently still very much a "behind the scenes" technology with no front-end visualisation, powerful only in the right hands and still difficult to set up, use and maintain. There are concerted efforts being made towards adding developer-friendly management interfaces or GUIs on top of Hadoop systems to move away from shell or command-line interfaces. Recently, Schoenherr et al. [54] presented Cloudgene for this precise purpose. Cloudgene provides a standardised graphical execution environment for currently available and future MapReduce programs, which can all be integrated by using its plug-in interface.

There are also drawbacks associated with the utilisation of cloud computing. One of the most significant challenges, given the scale of the genomic data being generated, is that transmitting such data over the Internet or any other form of communication media takes prolonged periods of time, sometimes even in the region of weeks. Thus, the bottleneck is the rate of data transfer, i.e., getting data into and out of the cloud. As outlined in [55], in an interview with Vivien Bonazzi, program director for computational biology and bioinformatics at the National Human Genome Research Institute (NHGRI), "putting data into a cloud cluster by way of the Internet can take many hours, even days, so cloud customers often resort to the 'sneaker net': overnight shipment of data-laden hard drives". In fact, BGI, one of the world's leading genomics research institutes, produces 200 genomes a day, with disks transported manually via FedEx [56]. AWS are actively trying to overcome this by introducing a multi-part upload, and companies such as Aspera are also designing a new layer to operate on top of the TCP (Transport Control Protocol) transport layer protocol in an attempt to alleviate this issue [57]. This high-speed file transfer (expected to be between 10 and 100 times faster than traditional FTP and HTTP approaches) has already been integrated into the BGI EasyGenomics SaaS solution showcased at the Bio-IT World Conference & Expo in 2012. BGI also integrated EasyGenomics with the Hadoop platform. This is notable as Hadoop and other scale-out big data technologies exhibit a distinct advantage over traditional data management approaches in that the computation is moved to the data. By utilising local commodity hardware, data is distributed across a cluster of machines each utilising local processing, storage and memory and is processed in parallel, negating the need to transfer the data across the network from its storage location to be processed, as is typically the case with traditional HPC solutions. Furthermore, the aforementioned upload challenges are typically only faced by the large sequencing centres or represent a one-off challenge. In contrast, as recently noted by Andreas Sundquist, CEO of DNAnexus, the upload of sequence data produced in real time from a single modern sequencing instrument requires a lower bitrate than streaming a movie over the Internet [58].
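A rough back-of-the-envelope check of that bandwidth argument, under assumed figures (a sequencer producing about 1 TB over a 10-day run, and movie streaming at about 5 Mbit/s; both numbers are illustrative, not from the article):

```python
# Back-of-the-envelope comparison of sequencing output bitrate vs. movie streaming.
run_output_bytes = 1e12          # ~1 TB produced by one instrument run (assumption)
run_seconds = 10 * 24 * 3600     # 10-day run (assumption)

seq_bitrate = run_output_bytes * 8 / run_seconds   # bits per second
movie_bitrate = 5e6                                # ~5 Mbit/s HD stream (assumption)

print("sequencer output: %.1f Mbit/s" % (seq_bitrate / 1e6))   # ~9.3 Mbit/s
print("movie stream:     %.1f Mbit/s" % (movie_bitrate / 1e6))
# Streaming one run's output in real time is of the same order as a single
# video stream -- consistent with the point quoted in the text; only bulk
# uploads of accumulated archives hit the "sneaker net" regime.
```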

As recently noted by Schadt [59], the ability to protect medical and genomics data in the era of big data and a changing privacy landscape is a growing challenge. While cloud computing is championed as a method for handling such big data sets, its perceived insecurity is viewed by many as a key inhibitor to its widespread adoption in the commercial life sciences sector. Indeed, this may explain why its employment has primarily been adopted by research and academic labs. However, it should be noted at the outset that in many cases cloud solutions can provide equivalent, if not improved, security depending on the local security policy employed. Clinical sequencing, however, must meet rigorous regulatory requirements, primarily from the Health Insurance Portability and Accountability Act of 1996 (HIPAA), and thus cloud computing is being cautiously considered in such cases. HIPAA standards require compliance across service, platform and infrastructure layers, i.e., across IaaS/PaaS and SaaS. As this is difficult to enforce and validate in a public cloud, with third-party contributors, Amazon is not HIPAA compliant. Amazon has however released a paper that will allow customers to develop healthcare applications that comply with a subset of HIPAA's regulations [60]. Hybrid clouds are thus considered a more secure approach, with hybrid FDA- and HIPAA-compliant clouds used as part of the collaboration discussed in the previous section between Dell and TGen to support the world's first personalised medicine trial for paediatric cancer [61], and commercial solutions such as those provided by GenomeQuest and DNAnexus. However, fundamental aspects of data security will need to be addressed before widespread adoption of cloud-based clinical sequencing can occur. Some of the key issues include encryption mechanisms (particularly key management), the vulnerabilities of Internet-based customer access interfaces, and replication in the case of disaster recovery, along with inadvertent data access via incorrect data 'deletion', i.e., reassignment of virtual resources allowing customers access to other customers' 'deleted' data. This will not be an overnight solution, and with increasingly advanced decryption and de-anonymisation techniques, the privacy of "anonymised" sequence data or Electronic Health Records (EHRs) may be extremely difficult to definitively guarantee. In the case of highly sensitive data, when all the technical precautions are provided, the weakest link in the chain may, as has traditionally been the case, be the human one. Nonetheless, the increasing impetus to utilise such technologies in order to exploit their economic benefits has highlighted the need for increased legislation in this area [62]. Data tenancy is another perceived challenge, particularly with public cloud usage, i.e., the availability of data should a commercial cloud service provider cease trading. This was evidenced when Google discontinued Google Health in early 2012, giving users a year within which to make alternative arrangements for their data. Furthermore, should this occur, or if another motivating factor causes a user to decide to move their data to another provider, the ease with which this transition can occur largely depends on the interoperability of the initial cloud service.

Unfortunately, most cloud infrastructures provide very little capability for data, application, and service interoperability. This makes it difficult for a customer to migrate from one provider to another, or to move data and services back to an in-house IT environment. Finally, a further challenge relates to data privacy legislation (e.g., data in the EU cannot be stored in a US region) as well as legal ownership and responsibility pertaining to data stored between international zones (e.g., the 1000 Genomes Project data exist only in the US zone, not the EU zone) [63].



Genome Engineering

Genome engineering took advantage of the CRISPR system, together with its programmable nuclease Cas9, for efficiently editing DNA or RNA at precise locations in a custom way.

From: Epigenetics and Regeneration, 2019

Related terms:

Nuclease

CRISPR

Guide RNA

Zinc Finger Nuclease

Nested Gene

Genome Editing

Cas9

Mutation

Escherichia coli


Genome engineering in insects: focus on the CRISPR/Cas9 system

V. Edwin Hillary, ... S. Ignacimuthu, in Genome Engineering via CRISPR-Cas9 System, 2020

Abstract

Genome engineering is a precise tool used to alter the genome of a desired organism. Zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR) with the CRISPR-associated RNA-guided endonuclease Cas9 (CRISPR/Cas9) are the major genome engineering tools in use today. The CRISPR/Cas9 system has enabled precise genome engineering in different species, including insects. This chapter covers genome engineering studies reported in various insects, including mosquitoes, butterflies, silkworm, and fruit fly, with a focus on the CRISPR/Cas9 system. Many studies have reported the application of ZFN, TALEN, and CRISPR/Cas9 in insects. In recent years, many scientists have adopted the CRISPR/Cas9 system for insect genome modification because of its affordability and the quick design of constructs. We also discuss the details and applications of gene drives. Further studies with CRISPR/Cas9 in insects will help researchers find effective strategies to combat the vector-borne diseases spread by insects like mosquitoes.


Genome Engineering for Therapeutic Applications

Pratiksha I. Thakore, Charles A. Gersbach, in Translating Gene Therapy to the Clinic, 2015

3.1 Introduction

Genome engineering technologies based on synthetic nucleases and transcription factors enable the targeted modification of the sequence and expression of genes. These engineered nucleases and transcription factors typically consist of a DNA-binding domain linked to an effector module. Zinc finger proteins (ZFPs) and transcription activator-like effector (TALE) DNA-binding domains were discovered in nature and systems have been developed to engineer synthetic versions of these proteins with the potential to recognize any nucleotide sequence in the genome. More recently, RNA-guided targeting of DNA sequences through the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas system has further simplified custom genome engineering, obviating the need for the complex protein engineering necessary for the ZFP- and TALE-based systems and therefore enabling widespread use of genome engineering. Effector modules that can be attached to targeted DNA-binding domains include endonuclease catalytic domains, transcriptional activators, and transcriptional repressors. With the potential to control the expression of any gene or modify any sequence in the human genome, these proteins have promising therapeutic potential, with zinc finger proteins in phase II clinical trials and multiple biotechnology companies emerging with a focus on genome engineering-based medicine. Synthetic nucleases have been used to guide targeted therapeutic gene addition, correct pathogenic mutations, and knock out disease-associated genes. Synthetic transcription factors have the potential to activate cellular reprogramming for regenerative medicine and modulate aberrant gene expression to treat cancer and other genetic diseases. This chapter reviews each of these technologies and their therapeutic applications (Table 3.1).

Table 3.1. Examples of Preclinical Development of Genome Engineering for Therapeutic Applications

Type of Modification | Target Gene | Therapeutic Application | DNA-Targeting Platform | Reference
Gene addition | ROSA26 | Safe-harbor site | Zinc finger nuclease | 1
Gene addition | AAVS1 | Safe-harbor insertion of gp91phox minigene for X-linked chronic granulomatous disease | Zinc finger nuclease | 2
Gene correction | IL2Rγ | HDR-based correction in primary T cells or CD34+ stem cells for X-linked SCID | Zinc finger nuclease | 3,4
Gene correction | β-globin | HDR-based correction in iPSCs for sickle cell anemia | Zinc finger nuclease | 5,6
Gene correction | CFTR | HDR-based correction in intestinal stem cells for cystic fibrosis | CRISPR/Cas9 | 7
Gene correction | SERPINA1 | HDR-based correction in hepatic cells for α1-antitrypsin deficiency | Zinc finger nuclease | 8
Gene correction | Factor IX | In vivo correction in the liver for hemophilia B | Zinc finger nuclease | 9,10
Gene correction | Dystrophin | Reading frame restoration by NHEJ in primary myoblasts for DMD | TALE nuclease | 11
Gene correction | Crygc | In vivo correction of dominant negative cataract disorder | CRISPR/Cas9 | 12
Gene correction | Fah | In vivo correction of hereditary tyrosinemia | CRISPR/Cas9 | 13
Gene disruption | CCR5 | Gene knockout in T cells for resistance to HIV infection | Zinc finger nuclease | 14
Gene disruption | CCR5 | Gene knockout in iPSCs for resistance to HIV infection | CRISPR/Cas9 | 15
Gene disruption | mtDNA | Disrupt mtDNA expressing mutations for Leber's hereditary optic neuropathy and dystonia | TALE nuclease | 16
Gene disruption | T-cell receptor, α- and β-chains | Disrupt native T-cell receptor gene for cancer immunotherapy | Zinc finger nucleases | 17,18
Gene disruption | Myostatin | Increase muscle mass and fiber size for treatment of sarcopenia and DMD | TALE nuclease | 19
Gene disruption | HLA-A2 | Disrupt HLA genes in T cells and embryonic stem cells to prevent immune rejection for allogeneic cell therapy | Zinc finger nuclease | 20
Gene disruption | PCSK9 | In vivo gene knockout to increase LDL receptor levels and decrease cholesterol levels | CRISPR/Cas9 | 21
Gene activation | MASPIN | Activate tumor suppressor to suppress breast cancer tumor growth | ZFP-VP64 fusion | 22–24
Gene activation | γ-globin | Upregulate compensatory embryonic globin to treat sickle cell anemia and β-thalassemia | ZFP-VP64, CRISPR/dCas9-VP64 | 25–27
Gene activation | Utrophin | Activate compensatory extracellular matrix protein to treat DMD | ZFP-VP16 | 28,29
Gene activation | GDNF | Treat neural cell degeneration in Parkinson's disease | ZFP-VP64 | 30
Gene activation | VEGF | Activate all isoforms of VEGF to treat ischemia in diabetic neuropathy | ZFP-VP64 | 31–33
Gene repression | BCR-ABL | Inhibit oncogene expression | ZFP (without effector) | 34
Gene repression | htt | Repress mutant htt expression to treat Huntington's disease | ZFP-KRAB | 35


The Use of CRISPR/Cas9, ZFNs, and TALENs in Generating Site-Specific Genome Alterations

Vikram Pattanayak, ... David R. Liu, in Methods in Enzymology, 2014

3 Conclusion

Genome engineering in the last few years has become more facile through the use of programmable site-specific nucleases such as TALENs and Cas9, which can be designed to target nearly any DNA sequence. As the use of ZFNs, TALENs, and Cas9 in research and clinical settings continues to grow, efforts to reveal in depth the DNA cleavage specificity of programmable nucleases will become increasingly important. Efforts to characterize programmable nuclease specificity have ranged from discrete target-site assays to in vitro selections to genome-wide selections, all of which have been applied recently to study TALEN and Cas9 specificity. The findings from these methods will continue to deepen our understanding of the basis of the DNA cleavage specificity of these important proteins, inform the development of programmable nucleases with improved specificity, and perhaps eventually enable the broad application of these or other programmable nucleases to treat human genetic diseases.


The Use of CRISPR/Cas9, ZFNs, and TALENs in Generating Site-Specific Genome Alterations

Minjung Song, ... Hyongbum Kim, in Methods in Enzymology, 2014

1 Introduction

Genome engineering in human cells is of great value in research, medicine, and biotechnology. In research, one of the best ways to determine the function of a human gene or genetic element is to compare the phenotype of human cells containing a mutation in the gene or element of interest with that of isogenic normal human cells. This process is increasingly important given that a growing number of researchers are using human pluripotent stem cells as disease models to investigate disease pathophysiology and screen therapeutic drugs in vitro (Colman & Dreesen, 2009; Saha & Jaenisch, 2009). Furthermore, if reporter genes or peptide tags are inserted into endogenous genes through genome engineering, it becomes possible to monitor or trace those genes. In medicine, many genetic diseases could be prevented or treated if the genetic mutations that cause the disease were corrected, as has been done in cell or animal models (Li et al., 2011; Osborn et al., 2013; Schwank et al., 2013; Voit, Hendel, Pruett-Miller, & Porteus, 2014; Yin et al., 2014). This targeted genetic modification can potentially also be used to treat nongenetic diseases such as human immunodeficiency virus (HIV) infection, which has been tested in human patients (Holt et al., 2010; Tebas et al., 2014). In biotechnology, targeted genetic modification can also contribute to technical developments. For example, when mammalian cells such as Chinese hamster ovary cells are used to produce specific proteins, genome engineering can improve yields and enhance the efficiency of this process.

Conventional gene targeting approaches based on homologous recombination (HR), which occurs in nature when sperm and eggs are generated, can be used to achieve targeted genetic modification in human cells (Smithies, Gregg, Boggs, Koralewski, & Kucherlapati, 1985; Song, Schwartz, Maeda, Smithies, & Kucherlapati, 1987). However, the efficiency of HR is extremely low, necessitating elaborate positive and negative selection to obtain cells that contain the desired modification.

Double-strand breaks (DSBs) at the target site can increase the efficiency of HR by at least two orders of magnitude (Rouet, Smih, & Jasin, 1994). Furthermore, error-prone repair of these DSBs through nonhomologous end joining (NHEJ) can lead to targeted mutagenesis (Bibikova, Golic, Golic, & Carroll, 2002). DSBs at specific genomic loci can be generated by specific sequence-recognizing programmable nucleases, which include zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and RNA-guided engineered nucleases (RGENs) (Kim & Kim, 2014).
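To make the frameshift consequence of error-prone NHEJ concrete, here is a small illustrative sketch (ours, not the chapter's) showing how a 1-bp deletion shifts every downstream codon; the sequence is made up.

```python
# Illustrative sketch: a 1-bp deletion, as error-prone NHEJ can leave behind,
# shifts the reading frame and scrambles every downstream codon.
def codons(seq):
    """Split a DNA sequence into consecutive 3-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

wild_type = "ATGGCTGAATCTGGTCGTAAA"      # hypothetical coding sequence
mutant = wild_type[:7] + wild_type[8:]   # delete one base at position 8

print("wild type:", codons(wild_type))   # ['ATG', 'GCT', 'GAA', 'TCT', ...]
print("mutant:   ", codons(mutant))      # every codon after the deletion shifts
```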

In this chapter, we will first briefly review the structure of the human genome. We will then describe the three programmable nucleases, i.e., ZFNs, TALENs, and RGENs, and their applications in human cells, including their potential utilization for the treatment of both genetic and nongenetic diseases. We will also review various methods for delivering programmable nuclease into human cells as well as techniques for improving the efficiency of the editing process by using nickases or surrogate reporters in human cells.


Modeling Cancer Using CRISPR-Cas9 Technology

Sandra Rodriguez-Perales, ... Raul Torres-Ruiz, in Animal Models for the Study of Human Disease (Second Edition), 2017

1.3 Genome Engineering Technologies

Genome engineering technologies were first developed some decades ago. Using this approach, we can modify precise DNA sequences to study gene function and regulation of the genes involved in cancer through the generation of in vitro and in vivo models. As mentioned in Section 1.2, traditional genome-engineered cancer models were generated using transgenes or HR in murine embryonic stem cells (mESCs) (Van Dyke and Jacks, 2002). In the early 1980s, specific genes were introduced into mESCs, which were then implanted into wild-type blastocysts, thus enabling transmission of genome modifications into the mouse germline to generate transgenic mouse models (Brinster et al., 1981; Harbers et al., 1981; Wagner et al., 1981). The first transgenic cancer models appeared in 1984, when brain tumors were reproduced by delivering the SV40 T-antigen viral oncogene into mouse eggs (Brinster et al., 1984). The second type of induced tumor resembled human breast cancer and was created by replacing the mouse Myc gene promoter with a hormonally inducible mouse mammary tumor virus promoter (Stewart et al., 1984). In 1986, Capecchi and Smithies, the pioneers of gene modification, developed the technology to modify the genome of mESCs by HR (Smithies et al., 1985; Thomas and Capecchi, 1986; Thomas et al., 1986). This development paved the way for various genome-engineered mouse models (GEMMs) of cancer by inducing precise alterations, leading to gain-of-function (GOF) mutations in oncogenes, and loss-of-function (LOF) mutations in TSGs. The generation of transgenic mice by HR was limited by the need for complex targeting, low efficiency, and the time required for the generation of a mouse with the desired modification (Frese and Tuveson, 2007).

Subsequent HR technology merged with site-specific recombinases (Cre and Flp), resulting in the generation of conditional alleles of numerous cancer genes (Orban et al., 1992; Shibata et al., 1997; Smith et al., 1995). Cre recombinase recognizes pairs of loxP sites, whereas Flp recognizes FRT sites, leading to reciprocal recombination of the specific DNA sequences between them. In addition, numerous transgenic mouse models have been generated by overexpression of cDNA or by gene knockdown using RNA interference (RNAi). RNAi technology was used widely for over a decade because it is an easy, rapid, and inexpensive approach to silencing genes (Tavernarakis et al., 2000). However, it is limited by its potential aberrant and artifactual effects, which in turn generate off-target effects (Jackson et al., 2003), and gene knockdown is usually partial and temporary.

Methods that artificially increased gene targeting were then developed. These methods were based on the observation that the occurrence of a double-strand break (DSB) in the targeted region dramatically increases gene-targeting efficiency (Jasin, 1996; Rouet et al., 1994). First, in the mid-1990s, the homing endonuclease I-SceI from Saccharomyces cerevisiae was used to induce breaks, thus demonstrating that the generation of controlled DSBs increases targeted HR (Choulika et al., 1995). Subsequently, tools for genome engineering based on site-specific endonucleases were developed; these included zinc-finger nucleases (ZFNs) (Bibikova et al., 2003), transcription activator-like effector nucleases (TALENs) (Christian et al., 2010), and clustered regularly interspaced short palindromic repeats (CRISPRs) (Jinek et al., 2012). All three methods, which are described in detail later, have revolutionized the study of cancer biology, especially cancer modeling (Torres-Ruiz and Rodriguez Perales, 2015). The use of ZFN and TALEN technologies is limited owing to their problematic design and elevated cost; however, the development of CRISPR systems offers a rapid and easy method for modification of single or multiple loci that overcomes the limitations of previous genome engineering tools, thus facilitating the rapid generation of cancer models.


Viruses as Targets for Biotechnology

Paula Tennant, Gustavo Fermin, in Viruses, 2018

CRISPR/Cas9 and Other Targets for Engineering Virus Resistance

Genome-engineering strategies have recently emerged as promising tools for developing virus resistance in eukaryotic species. Among these technologies, the CRISPR (clustered regularly interspaced short palindromic repeats)/CRISPR-associated protein 9 (CRISPR/Cas9) system has received special interest because of its simplicity, efficiency, and reproducibility. CRISPR/Cas9 is a prokaryotic molecular immunity system against invading nucleic acids (acquired through horizontal gene transfer) and bacteriophages (Chapter 10: Host–Virus Interactions: Battles Between Viruses and Their Hosts). Once the underlying mechanisms of CRISPR/Cas9 became clear, biologists realized that the system could provide an efficient means of editing genomes. Only two components are required for the engineered system to function: the Cas9 protein and the single guide RNA (sgRNA). Cas9 is a nuclease that creates a double-strand break (DSB) at a specific genomic location, and the sgRNA, which can be programmed to recognize virtually any gene sequence, serves as the homing device that directs the Cas9 nuclease to the target. In 2012, the system, derived from Streptococcus pyogenes, was developed into a gene editing tool: the CRISPR/Cas9 machinery was modified to recognize specific sequences in the genome and cut the targeted DNA to generate DSBs, which then stimulate the host cell's own repair machinery via either nonhomologous end joining (NHEJ) or homology-directed repair (HDR). NHEJ can introduce small insertions or deletions (indels) that cause frameshift mutations and disrupt gene function, while HDR can be harnessed for target gene correction, gene replacement, and gene knock-ins. Recent studies have used CRISPR/Cas9 to engineer virus resistance in plant and animal hosts by (1) directly targeting and cleaving the viral genome, (2) disrupting a provirus inserted in the host genome, or (3) editing host genes to confer viral immunity.
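As a concrete illustration of the targeting rule, the following is a minimal sketch of our own (not from the chapter): it scans a DNA string for candidate S. pyogenes Cas9 sites, i.e., 20-nt protospacers immediately followed by an NGG PAM. The example sequence is invented, and a real design tool would also scan the reverse-complement strand and score potential off-targets.

```python
# Find candidate SpCas9 target sites: 20-nt protospacer + NGG PAM.
import re

def find_cas9_targets(dna: str):
    """Yield (position, protospacer, PAM) for every 20-mer followed by NGG."""
    dna = dna.upper()
    # Zero-width lookahead so overlapping candidate sites are all reported.
    for m in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", dna):
        yield m.start(), m.group(1), m.group(2)

example = "TTGACGTTACCGATTGCATGCATAGCAAGGTACGATCGATCGTAGCTAGGA"
for pos, spacer, pam in find_cas9_targets(example):
    print(f"{pos:3d}  {spacer}  PAM={pam}")
```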

Two strategies have emerged as promising ways to introduce virus resistance into plant species: one targets the virus genome itself, while the other generates resistance by targeting and modifying plant genes encoding host factors required for virus replication. The former strategy was first demonstrated with geminiviruses. N. benthamiana plants expressing the CRISPR/Cas9 machinery exhibited resistance against Tomato yellow leaf curl virus, Beet curly top virus, and Merremia mosaic virus; these plants displayed considerably reduced viral titers, which abolished or significantly reduced disease symptoms. It was also shown that one sgRNA targeting the Bean yellow dwarf virus (BeYDV) genome could confer plant resistance without cleavage activity, suggesting that catalytically inactive Cas9 could be used to mediate virus interference, thereby eliminating concerns about off-target activity in the plant genome. The second approach, that of targeting a plant gene, was demonstrated in a potyvirus-cucumber host system. The translation initiation factors eIF4E and eIF(iso)4E are known to be directly involved in the infection cycle of RNA viruses. These targets were mutated with the CRISPR/Cas9 system to develop cucumber plants resistant to Cucumber vein yellowing virus, Zucchini yellow mosaic virus, and Papaya ringspot virus-type W. Disruption of the eIF4E gene in cucumber by CRISPR/Cas9 and the resulting virus-resistant plants proved that translation initiation factors are prime candidates among host genes that can be targeted; indeed, technically any host gene encoding a factor that the virus requires is a potential target for modification. One major advantage of targeting and modifying host factors is that CRISPR/Cas9 can be introduced as a transgene to create the genome edits and subsequently removed in later generations by backcrossing, yielding virus-resistant plants free of genetic modification. Alternatively, the Cas9 protein and the sgRNA can be introduced directly into cells as a ribonucleoprotein complex to avoid incorporation of transgenes into the genome. The resulting plants would be indistinguishable from plants carrying naturally occurring alleles or plants generated by random mutagenesis, which may exempt them from current GMO regulations. The debate on this matter, the regulation of CRISPR/Cas9-derived organisms, continues.

CRISPR/Cas9 also has potential applications against human viral diseases. For HIV, for example, the possibility that CRISPR/Cas9 could be used to inactivate or even delete proviral DNA from HIV-1-infected cells has been examined. In some studies, the cofactor CCR5 was targeted in pluripotent stem cells and hematopoietic stem cells. HIV-1 entry is mediated by sequential binding of its surface glycoprotein to the cellular receptor CD4 and then to a chemokine receptor, CCR5. A naturally occurring genetic mutation known as CCR5Δ32 confers HIV resistance: the mutation results in a smaller protein that no longer sits on the cell surface, thereby hampering HIV's ability to infiltrate immune cells. Using CRISPR/Cas9, Δ32 mutations precisely matching the naturally occurring homozygous CCR5Δ32 genotype were generated, and monocytes/macrophages derived from pluripotent stem cells carrying the CCR5Δ32 mutation were resistant to HIV-1 infection. In other studies, the CRISPR/Cas9 system was reportedly useful for editing the HIV genome integrated into the host cell genome. Several labs have designed sgRNAs to program Cas9 to cleave different regions of HIV-1 DNA, including either essential viral genes or the viral long terminal repeats (LTRs). Suppression of HIV-1 production and infection was observed in different cell types, including latently infected CD4+ T cells, primary CD4+ T cells, and induced human pluripotent stem cells. The extreme variability and high evolution rate of HIV-1, however, may require programming Cas9 with multiple sgRNAs that target conserved viral DNA regions in order to avoid virus escape. It is also possible that CRISPR/Cas9 could be combined with antiretroviral agents (highly active antiretroviral therapy, HAART) to clear latently infected cells.
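The escape problem suggests choosing protospacers from regions shared by many isolates. Below is a hedged sketch of our own, not from the text: it intersects PAM-adjacent 20-mers across a set of invented toy "isolate" sequences. Real pipelines would work from alignments of clinical isolates and add off-target filtering.

```python
# Select Cas9 protospacers conserved across several (toy) viral isolates.

def kmers_with_pam(dna: str, k: int = 20) -> set:
    """All k-mers on this strand that are immediately followed by NGG."""
    dna = dna.upper()
    return {
        dna[i:i + k]
        for i in range(len(dna) - k - 2)
        if dna[i + k + 1:i + k + 3] == "GG"  # NGG: any base, then GG
    }

CORE = "GATTACAGATTACAGATTAC"  # invented conserved 20-mer
isolates = [
    "AAAA" + CORE + "TGGCCCC",
    "TTT" + CORE + "TGGGGAA",
    CORE + "TGGACGT",
]
conserved = set.intersection(*(kmers_with_pam(s) for s in isolates))
print(conserved)  # protospacers present with a PAM in every isolate
```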

A similar strategy proved successful in generating resistance to Porcine reproductive and respiratory syndrome virus (PRRSV), an enveloped ssRNA(+) virus of the family Arteriviridae in the order Nidovirales that causes one of the most important infectious diseases of pigs worldwide. Porcine Reproductive and Respiratory Syndrome (PRRS) manifests differently in pigs of all ages, but primarily causes late-term abortions and stillbirths in sows and respiratory disease in piglets. As with the HIV-1 CCR5 strategy, the CRISPR/Cas9 system was used to introduce a deletion into a cell surface receptor, thereby hampering the virus's ability to infiltrate immune cells. PRRSV has a highly restricted tropism for cells of the monocyte-macrophage lineage, and its entry via fusion with the host cell membrane is mediated by the receptor CD163, in particular domain 5 of CD163. Using CRISPR/Cas9 gene editing, pig zygotes carrying a deletion in CD163 domain 5 were generated, and the deletion rendered cells resistant to PRRSV.

Good efficacy has also been reported in studies using CRISPR/Cas technologies to disable replication of DNA herpesviruses such as Herpes simplex virus type 1 (Human alphaherpesvirus 1, HHV 1), Epstein-Barr virus (EBV; Human gammaherpesvirus 4, HHV 4), and Human cytomegalovirus (Human betaherpesvirus 5, HHV 5). Upon infection, the linear viral DNA of these viruses is delivered to the cell's nucleus, where it circularizes to form a viral episome. Depending on several factors, replication can proceed either to a productive infection or to a state of latency; in either case, the viral genome is maintained as extrachromosomal circular DNA. Using sgRNAs, CRISPR/Cas9 was directed to cut virus DNA, whether actively replicating or latent in the host cell, at one or more sites and to induce mutations that cripple virus replication. Although active replication of all three viruses was abolished, the latent genome was cleared only in host cells challenged with EBV. These potent antiviral activities will need to be tested in animal studies, and eventually in humans, before any application as a future therapy.

Targeting RNA viruses with a similar approach was demonstrated with Hepatitis C virus (Hepacivirus C, HCV) and a Cas9 from Francisella tularensis subsp. novicida (FnCas9), which naturally targets bacterial mRNA. In this investigation, FnCas9 was retargeted to HCV's ssRNA(+) genome in order to block both viral translation and replication. An RNA-targeting guide RNA complementary to a portion of the highly conserved HCV 5′ untranslated region (UTR) was developed, and vectors encoding the sgRNA and FnCas9 were transfected into human hepatocellular carcinoma cells, which were subsequently infected with HCV. Expression of the 5′ UTR-targeting sgRNA together with FnCas9 reduced the levels of viral proteins. Applications to other ssRNA(+) viruses such as flaviviruses, enteroviruses, and rhinoviruses, as well as to ssRNA(−) viruses such as filoviruses, paramyxoviruses, and orthomyxoviruses, are anticipated. Because FnCas9 can target both negative- and positive-sense RNA strands, the FnCas9:sgRNA machinery is likely to be used against diverse viruses; indeed, the replication cycles of viruses with both RNA and DNA genomes require an RNA stage (generated during transcription, replication, or both). Taken together, the CRISPR/Cas9 system holds much promise for future therapies against virus diseases. While there are many examples of the efficacious use of CRISPR/Cas9 in cell culture, application in the clinic or the agricultural field will require methods of safe and efficient delivery.
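To show what "complementary guide" means operationally, here is a minimal sketch of our own (the sequence is an invented placeholder, not the real HCV 5′ UTR): it derives an RNA guide as the reverse complement of a chosen window of a (+)-sense viral RNA, so the guide can base-pair with its target in antiparallel orientation.

```python
# Derive an RNA-targeting guide: the reverse complement of a target window,
# since RNA-RNA base-pairing is antiparallel.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def rna_guide(target_rna: str, start: int, length: int = 20) -> str:
    """Guide sequence complementary to target_rna[start:start+length]."""
    window = target_rna.upper().replace("T", "U")[start:start + length]
    return "".join(RNA_COMPLEMENT[b] for b in reversed(window))

viral_rna = "AUGGCUAGCUUAGGCAUCGAUCGUAGCUAAGCGCGAUAGC"  # placeholder only
print(rna_guide(viral_rna, start=5))  # 20-nt guide for window 5..24
```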


Genetic Engineering for Plant Transgenesis

Surender Khatodia, S.M. Paul Khurana, in Omics Technologies and Bio-Engineering, 2018

5.4 Chloroplast Genome Engineering for Pharmaceuticals

Chloroplast genome engineering has enabled the stable integration and expression of transgenes encoding pharmaceutical proteins, antibiotics, vaccine antigens, and industrial enzymes. Transforming the chloroplast rather than the nuclear genome offers several advantages, including high expression levels, multigene engineering, gene containment via maternal inheritance, and subcellular compartmentalization. For plant molecular farming (PMF), chloroplast genome engineering can yield recombinant protein at up to 70% of total leaf protein, and the ability to express proteins in edible leaves permits oral delivery and significantly reduces production costs (Jin and Daniell, 2015). In chloroplast engineering technology, transgenes are inserted into the chloroplast genome by site-specific homologous recombination, which eliminates gene silencing, positional effects, and pleiotropic effects in the transgenic lines (Verma et al., 2008). Expression of antigens from the chloroplast genome in leaves also allows complete transgene containment and high gene expression levels, and facilitates several important posttranslational modifications (Jin and Daniell, 2015).

Extensive optimization was undertaken to develop a reproducible expression system utilizing species-specific chloroplast vectors, endogenous regulatory sequences, and optimal organogenesis/hormone concentrations to directly generate transplastomic lines without callus induction, creating chloroplast bioreactors for pharmaceuticals (Chan and Daniell, 2015; Fig. 5.7). Lettuce currently serves as the only reproducible transplastomic system for oral delivery of vaccines and pharmaceuticals. More than 40 pharmaceuticals and vaccine antigens have been expressed via the chloroplast genome (Jin and Daniell, 2015; Table 5.3).


Figure 5.7. Chloroplast transformation using particle gun bombardment of chloroplast vectors is followed by two to three rounds of antibiotic selection and subsequent regeneration of homoplasmic transformants (Chan and Daniell, 2015).

Table 5.3. Chloroplast Bioreactors for Functional Pharmaceuticals and Vaccine Antigens (Jin and Daniell, 2015)

| Pharmaceutical/vaccine antigen | Expression system | Expression level | Functional evaluation | References |
| --- | --- | --- | --- | --- |
| CTB–AMA1 (malarial vaccine antigen apical membrane antigen-1) | Lettuce; Tobacco | 7.3% TSP; 13.2% TSP | Long-term dual immunity against two major infectious diseases: cholera and malaria | Davoodi-Semiromi et al. (2010) |
| CTB–MSP1 (malarial vaccine antigen merozoite surface protein-1) | Lettuce; Tobacco | 6.1% TSP; 10.1% TSP | Long-term dual immunity against two major infectious diseases: cholera and malaria | Davoodi-Semiromi et al. (2010) |
| EDA (extra domain A-fibronectin) | Tobacco | 2.0% TSP | Retains the proinflammatory properties of the EDA produced in Escherichia coli | Farran et al. (2010) |
| Parvovirus immunogenic peptide 2L21 fused to a tetramerization domain | Tobacco | 6% TSP | Immunogenic response in mice | Ortigosa et al. (2010) |
| Immunogenic fusion protein F1-V from Yersinia pestis | Tobacco | 14.8% TSP | Oral delivery of F1-V protected 88% of mice against aerosolized Y. pestis; F1-V injections protected only 33%, and all challenged control mice died. Oral boosters conferred protective immunity against plague | Arlen et al. (2008) |
| Coagulation factor IX | Tobacco | 3.8% TSP | Prevents inhibitor formation and fatal anaphylaxis in hemophilia B mice | Verma et al. (2010) |
| BACE (human β-site APP cleaving enzyme) | Tobacco | 2.0% TSP | Immunogenic response against the BACE antigen in mice | Youm et al. (2010) |
| Human papillomavirus L1 protein | Tobacco | 21.5% TSP | Confirmed the formation of capsomeres | Waheed et al. (2011a,b) |
| Proinsulin | Tobacco | 47% TSP | Oral delivery of proinsulin in plant cells or injectable delivery into mice showed reduced blood glucose levels | Boyhan and Daniell (2011) |
| PA (dIV) (domain IV of Bacillus anthracis protective antigen) | Tobacco | 5.3% TSP | Demonstrates protective immunity in mice against anthrax | Gorantala et al. (2011) |
| Human thioredoxin 1 protein | Lettuce | 1% TSP | Protected mouse insulinoma line 6 cells from hydrogen peroxide | Lim et al. (2011) |
| Thioredoxin–human serum albumin fusions | Tobacco | 26% TSP | The in vitro chaperone activity of Trx m and Trx f was demonstrated | Sanz-Barrio et al. (2011) |
| HPV16 L1 antigen fused with LTB | Tobacco | 2% TSP | Proper folding and display of conformational epitopes | Waheed et al. (2011a,b) |
| Exendin 4 (EX4) fused to CTB | Tobacco | 14.3% TSP | CTB–EX4 showed increased insulin secretion similar to commercial EX4 in β-TC6 cells | Kwon et al. (2013) |
| CTB–ESAT-6 (6 kDa early secretory antigenic target) | Tobacco; Lettuce | Up to 7.5%; 0.75% | Hemolysis and GM1-binding assays confirmed the functionality and structure of the ESAT-6 antigen | Lakshmi et al. (2013) |
| CTB–Mtb72F (a fusion polyprotein from two tuberculosis antigens, Mtb32 and Mtb39) | Tobacco | Up to 1.2% | Not reported | Lakshmi et al. (2013) |
| CTB fused to MBP (myelin basic protein) | Tobacco | 2% TSP | Amyloid loads were reduced in vivo in brain regions of 3×TgAD mice fed bioencapsulated CTB–MBP; Aβ(1–42) accumulation was reduced in retinae and loss of retinal ganglion cells was prevented in treated 3×TgAD mice | Kohli et al. (2014) |
| Coagulation factor VIII (FVIII) antigens: heavy chain (HC) and C2 fused with CTB | Tobacco | 80 or 370 μg/g in fresh leaves | Feeding the HC/C2 mixture substantially suppressed T helper cell responses and inhibitor formation against FVIII in hemophilia A mice | Sherman et al. (2014) |

TSP, total soluble protein.


Synthetic Biology, Part B

Harris H. Wang, George M. Church, in Methods in Enzymology, 2011

4 Concluding Remarks

Recombineering-based genome engineering provides a powerful approach for constructing and modifying chromosomes synthetically. As the cost of oligonucleotide synthesis continues to drop and automation capacities continue to expand, efficient "on-the-fly" manipulation of a living organism's genome will continue to improve. With the MAGE platform, existing genomic templates are used as scaffolds to produce newly engineered variants. An important benefit of template-based genome engineering is that it harnesses natural selection: new genomes evolve by directed steps from existing functional genomes. Genome engineering approaches coupled with de novo synthesis methods (Chan et al., 2005; Gibson et al., 2010; Menzella et al., 2005; Tian et al., 2004) will continue to offer an expanding capability to engineer living organisms at the resolution of single nucleotides, but scaled across the entire genome and beyond.
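As a rough intuition for the MAGE cycle, here is a toy sketch of our own (not the authors' implementation): a pool of hypothetical single-locus edits is applied to a genome string over repeated cycles, each edit landing with some per-cycle efficiency. Real MAGE introduces ssDNA oligos at the replication fork, and efficiencies depend on oligo design rather than a fixed probability.

```python
# Toy MAGE simulation: multiplexed edits accumulate over repeated cycles.
import random

def mage_cycle(genome: str, oligos, efficiency: float = 0.3) -> str:
    """Apply each (target, replacement) edit with the given probability."""
    for target, replacement in oligos:
        if target in genome and random.random() < efficiency:
            genome = genome.replace(target, replacement, 1)
    return genome

random.seed(0)
genome = "ATGGCTAAACCCGGGTTTACAGATTAG"
oligos = [("GCTAAA", "GCGAAA"), ("GGGTTT", "GGGTAT")]  # hypothetical edits
for cycle in range(5):
    genome = mage_cycle(genome, oligos)
print(genome)  # most target sites edited after a few cycles
```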


CRISPR-Cas9 system for fungi genome engineering toward industrial applications

Lakkakula Satish, ... Yaron Sitrit, in Genome Engineering via CRISPR-Cas9 System, 2020

6.1 Introduction

Genome engineering approaches have advanced significantly over the past ten years (Glass et al., 2018). Formerly constrained to a few model organisms and largely inefficient, genome engineering became broadly useful through the development of various programmable DNA-binding proteins, chief among them zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) (Chen et al., 2014). By fusing such DNA-binding proteins to nucleases, it is feasible to create a double-strand break (DSB) at a chosen site in the genome, which is then repaired either by non-homologous end joining (NHEJ), frequently producing small insertions or deletions and thus gene knockouts, or by homology-directed repair (HDR) with a donor sequence, allowing insertion of a desired DNA template (Miller et al., 2010; Urnov et al., 2005). The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system, found in nature as a microbial adaptive host defense system, has been remodeled into a key genome engineering tool (Glass et al., 2018). The CRISPR-Cas9 approach is poised to transform developmental biology by providing an efficient and simple tool for precisely modifying the genome of virtually any organism (Harrison et al., 2014; Komor et al., 2017). DNA cleaved by the Cas9 nuclease can be repaired through either the NHEJ or the HDR pathway, although the exact mechanisms underlying these repair pathways remain incompletely understood (Harrison et al., 2014).
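To illustrate why NHEJ-induced indels typically knock out a gene, the following toy example (ours, with a deliberately minimal codon table covering only these sequences) translates a short ORF before and after a 1-nt deletion; the deletion shifts the reading frame and scrambles every downstream codon.

```python
# Frameshift knockout in miniature: a 1-nt deletion changes all downstream
# codons. The codon table is intentionally tiny, covering just this example.
CODONS = {"ATG": "M", "GCT": "A", "GAA": "E", "TGG": "W", "TAA": "*",
          "GTG": "V", "AAT": "N", "GGT": "G"}

def translate(orf: str) -> str:
    peptide = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODONS.get(orf[i:i + 3], "?")
        if aa == "*":  # stop codon terminates translation
            break
        peptide.append(aa)
    return "".join(peptide)

wild_type = "ATGGCTGAATGGTAA"                   # M-A-E-W-stop
edited = wild_type[:4] + wild_type[5:]          # NHEJ-style 1-nt deletion
print(translate(wild_type), translate(edited))  # MAEW vs. frameshifted MVNG
```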

Traditional approaches to genome editing, however, are mostly inefficient. The CRISPR-Cas9 method has been extensively reported in bacteria, plants, and mammals, and has become a quick, simple, and adaptable tool for systematic research on various fungi (Deng et al., 2017a). It has enabled genome engineering in numerous industrially significant organisms and unlocked genetic systems that were formerly inaccessible (Donohoue et al., 2018). This chapter focuses on current progress in CRISPR-Cas9-mediated genome engineering methods for targeted genome editing and their prospective applications in industrially important fungi.


Gene Editing

David A. Dunn, Carl A. Pinkert, in Transgenic Animal Technology (Third Edition), 2014

VI Summary

The advent of genome engineering technology has opened avenues not previously available and has generally made reverse genetics-based experimentation considerably simpler. While HR in murine ES cells has been the workhorse of gene targeting for many years, it is a labor-intensive and inefficient process (Dunn et al., 2005, 2012; Martin et al., 2010). Compared with traditional ES-cell methods, genome-editing constructs increase the efficiency of mouse transgenesis, both in knockout experiments via NHEJ-mediated DSB repair (Carbery et al., 2010; Sung et al., 2013) and in knock-in experiments in which a construct bearing homology arms is coinjected with the gene-editing construct (Cui et al., 2011). Multiple factors contribute to this increased efficiency. The ES-cell stage is eliminated entirely, because the construct (usually mRNA) is injected directly into the cytoplasm of unicellular zygotes, and targeting events usually occur at the single-cell stage or soon thereafter; transgenic animals are therefore more likely to carry targeted mutations in their germ cells than are chimeric mice derived from ES cells. Additionally, in knock-in models, HR occurs at much higher rates in the presence of either a DSB or a nick on one strand of the DNA.

Perhaps more excitingly, these techniques make gene targeting possible in a wide range of species for the first time. For many years, the mouse was the only species in which in vivo knockout models were possible. The development of induced pluripotent stem cells in nonmurine species partially alleviated that situation (Hamanaka et al., 2011). Nevertheless, not until the creation of engineered endonucleases did the ability to ablate gene expression in a diverse array of animals become a reality.

Genome engineering has progressed quickly over the past few years, from a future ideal requiring further research to a readily available technique accessible to any laboratory with basic molecular cloning capabilities. Nevertheless, the field is still very young, and refinements of existing techniques and the discovery of new and enhanced approaches are certain. The fast pace of advances and the excitement generated over just the last few years show no signs of abating. Gene-targeted animal models for a range of biomedical and agricultural applications can now be established.

