2.  Total gastrectomy increases the incidence of grade III and IV toxicities in patients with gastric cancer receiving adjuvant TS-1 treatment 
We aimed to evaluate the safety and efficacy of TS-1 adjuvant chemotherapy in Taiwanese patients with gastric cancer.
We included in this study patients with locally advanced gastric cancer who received adjuvant TS-1 or 5-fluorouracil chemotherapy after curative surgery and extended lymph node dissection between 1 June 2008 and 31 December 2012 at Chang Gung Memorial Hospital. Patient characteristics, tumor features, safety profiles and compliance with TS-1 treatment were retrospectively analyzed from medical charts.
Forty patients received adjuvant chemotherapy with TS-1 and 193 with 5-fluorouracil within the study period. The 1- and 2-year overall survival rates were 90.6% and 87% in the TS-1 group and 95.4% and 86.8% in the 5-fluorouracil group (P = 0.34). The 1- and 2-year disease-free survival rates were 90.6% and 74.7% in the TS-1 group and 88% and 75.7% in the 5-fluorouracil group (P = 0.66). In the TS-1 group, tumor recurrence was more frequent in those with >15 metastatic lymph nodes than in those with ≤15. Overall, 78.9%, 74.3%, 62.1% and 56% of patients underwent TS-1 treatment for at least 3, 6, 9 and 12 months, respectively. The most common adverse events of TS-1 were skin hyperpigmentation (55%), diarrhea (27.5%), dizziness (27.5%) and leukopenia (20%). Severe adverse events (SAEs; grade III or IV toxicity) were diarrhea (7.5%), stomatitis (7.5%), leukopenia (5%), vomiting (2.5%), anorexia (2.5%) and dizziness (2.5%). Patients who underwent total gastrectomy had a significantly greater risk of TS-1-related SAEs than patients who underwent subtotal gastrectomy (40% versus 8%, P = 0.014).
The incidence of SAEs during TS-1 therapy was higher in Taiwanese patients with gastric cancer who underwent total gastrectomy than in those who underwent subtotal gastrectomy. Clinicians must be aware of and able to manage these SAEs to maximize patient compliance with adjuvant TS-1.
PMCID: PMC4228399  PMID: 24180462
Adjuvant chemotherapy; Compliance; Gastric cancer; Safety profile; TS-1
3.  Longitudinal Changes and Predictors of Caregiving Burden While Providing End-of-Life Care for Terminally Ill Cancer Patients 
Journal of Palliative Medicine  2013;16(6):632-637.
The effect of caring for a dying cancer patient on caregiving burden has been explored primarily in Western-based studies with small samples or in studies that did not follow up until the patient's death, but has not yet been investigated in Taiwan.
The study's goals were (1) to identify the trajectory of caregiving burden for family caregivers (FCs) of terminally ill cancer patients in Taiwan, and (2) to investigate the determinants of caregiving burden in a large sample and with longitudinal follow-ups, until the patient's death.
A prospective, longitudinal study was conducted among 193 FCs. The trajectory and determinants of caregiving burden were identified by a generalized estimation equation approach.
Caregiving burden did not change as the patient's death approached. FCs experienced heavy caregiving burden when their relative suffered from greater symptom distress or if they were spousal caregivers; provided high intensity of assistance to the patient while spending fewer hours providing care; reported financial insufficiency; or had lower social support, fewer psychological resources, or less confidence in caregiving.
Taiwanese family caregivers carry a moderate caregiving burden, which did not change significantly as the patient's death approached. The effects of caregiving burden while providing EOL care to terminally ill cancer patients may be tempered substantially by enhancing family caregivers' caregiving confidence, social support, and psychological resources.
PMCID: PMC3667422  PMID: 23556989
4.  Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species 
Bradnam, Keith R | Fass, Joseph N | Alexandrov, Anton | Baranay, Paul | Bechner, Michael | Birol, Inanç | Boisvert, Sébastien | Chapman, Jarrod A | Chapuis, Guillaume | Chikhi, Rayan | Chitsaz, Hamidreza | Chou, Wen-Chi | Corbeil, Jacques | Del Fabbro, Cristian | Docking, T Roderick | Durbin, Richard | Earl, Dent | Emrich, Scott | Fedotov, Pavel | Fonseca, Nuno A | Ganapathy, Ganeshkumar | Gibbs, Richard A | Gnerre, Sante | Godzaridis, Élénie | Goldstein, Steve | Haimel, Matthias | Hall, Giles | Haussler, David | Hiatt, Joseph B | Ho, Isaac Y | Howard, Jason | Hunt, Martin | Jackman, Shaun D | Jaffe, David B | Jarvis, Erich D | Jiang, Huaiyang | Kazakov, Sergey | Kersey, Paul J | Kitzman, Jacob O | Knight, James R | Koren, Sergey | Lam, Tak-Wah | Lavenier, Dominique | Laviolette, François | Li, Yingrui | Li, Zhenyu | Liu, Binghang | Liu, Yue | Luo, Ruibang | MacCallum, Iain | MacManes, Matthew D | Maillet, Nicolas | Melnikov, Sergey | Naquin, Delphine | Ning, Zemin | Otto, Thomas D | Paten, Benedict | Paulo, Octávio S | Phillippy, Adam M | Pina-Martins, Francisco | Place, Michael | Przybylski, Dariusz | Qin, Xiang | Qu, Carson | Ribeiro, Filipe J | Richards, Stephen | Rokhsar, Daniel S | Ruby, J Graham | Scalabrin, Simone | Schatz, Michael C | Schwartz, David C | Sergushichev, Alexey | Sharpe, Ted | Shaw, Timothy I | Shendure, Jay | Shi, Yujian | Simpson, Jared T | Song, Henry | Tsarev, Fedor | Vezzi, Francesco | Vicedomini, Riccardo | Vieira, Bruno M | Wang, Jun | Worley, Kim C | Yin, Shuangye | Yiu, Siu-Ming | Yuan, Jianying | Zhang, Guojie | Zhang, Hao | Zhou, Shiguo | Korf, Ian F
GigaScience  2013;2:10.
The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly.
In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies.
Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.
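N50, listed among the keywords for this entry, is one widely used summary statistic for assembly contiguity. The sketch below shows how it is commonly computed from a list of contig or scaffold lengths. It is an illustration only, not the evaluation code used in Assemblathon 2; the function name `n50` and the example lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the largest length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical scaffold lengths in base pairs, for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
example_n50 = n50(example_lengths)
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is one reason an assessment may combine it with several other measures.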
PMCID: PMC3844414  PMID: 23870653
Genome assembly; N50; Scaffolds; Assessment; Heterozygosity; COMPASS
5.  Comparing end-of-life care in hospitalized patients with chronic obstructive pulmonary disease with and without palliative care in Taiwan 
We investigated differences in clinical practice patterns between end-stage chronic obstructive pulmonary disease (COPD) patients with and without palliative care at the end of life in Taiwan.
Materials and Methods:
A total of 91 COPD patients who died in an acute care hospital were enrolled from one community teaching hospital in northern Taiwan between September 1, 2007 and December 31, 2009. The patients were divided into palliative (n = 17) and non-palliative care (n = 74) groups. Demographics and medical care data obtained through retrospective review of medical records were analyzed to determine significant between-group differences.
There were no between-group differences in intensive care unit (ICU) utilization, duration of ICU stay, duration of ventilator usage, invasive diagnostic procedures, invasive treatments, medications, and total medical cost. Patients in the palliative group had longer hospital stays (median 26 days vs. 11 days, P < 0.01) and higher rate of do-not-resuscitate orders (100% vs. 51%, P < 0.001), but lower rates of ICU mortality (73% vs. 41%, P = 0.026), invasive ventilation (57% vs. 29%, P = 0.04), cardiopulmonary resuscitation (12% vs. 51%, P < 0.001), and daily medical cost (250 US dollars vs. 444 US dollars, P < 0.001).
Palliative care was underutilized and referral was delayed for COPD patients. COPD patients are polysymptomatic approaching the end of life and this characteristic should be taken into account in providing appropriate end-of-life care in the same way as for cancer patients. Palliative care for COPD patients is urgently needed in Taiwan and should be promoted.
PMCID: PMC3897028  PMID: 24516493
Chronic obstructive pulmonary disease; end-of-life; palliative care
6.  Transcriptome Sequencing and Annotation for the Jamaican Fruit Bat (Artibeus jamaicensis) 
PLoS ONE  2012;7(11):e48472.
The Jamaican fruit bat (Artibeus jamaicensis) is one of the most common bats in the tropical Americas. It is thought to be a potential reservoir host of Tacaribe virus, an arenavirus closely related to the South American hemorrhagic fever viruses. We performed transcriptome sequencing and annotation from lung, kidney and spleen tissues using 454 and Illumina platforms to develop this species as an animal model. More than 100,000 contigs were assembled, with 25,000 genes that were functionally annotated. Of the remaining unannotated contigs, 80% were found within bat genomes or transcriptomes. Annotated genes are involved in a broad range of activities ranging from cellular metabolism to genome regulation through ncRNAs. Reciprocal BLAST best hits yielded 8,785 sequences that are orthologous to mouse, rat, cattle, horse and human. Species tree analysis of sequences from 2,378 loci was used to achieve 95% bootstrap support for the placement of bat as sister to the clade containing horse, dog, and cattle. Through substitution rate estimation between bat and human, 32 genes were identified with evidence for positive selection. We also identified 466 immune-related genes, which may be useful for studying Tacaribe virus infection of this species. The Jamaican fruit bat transcriptome dataset is a resource that should provide additional candidate markers for studying bat evolution and ecology, and tools for analysis of the host response and pathology of disease.
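The reciprocal-best-hit idea mentioned above can be sketched as follows. This is a minimal illustration, assuming the best hit in each direction has already been extracted from BLAST output into a dictionary; the function name and all sequence identifiers are hypothetical, and the abstract does not state how hits were scored or filtered.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a.

    Both arguments map a query identifier to the identifier of its
    single best hit in the other species.
    """
    pairs = []
    for a_id, b_id in best_a_to_b.items():
        if best_b_to_a.get(b_id) == a_id:
            pairs.append((a_id, b_id))
    return sorted(pairs)


# Hypothetical identifiers, for illustration only.
bat_to_mouse = {"bat_c1": "mouse_g1", "bat_c2": "mouse_g2", "bat_c3": "mouse_g2"}
mouse_to_bat = {"mouse_g1": "bat_c1", "mouse_g2": "bat_c3"}
orthologs = reciprocal_best_hits(bat_to_mouse, mouse_to_bat)
```

Here `bat_c2` is dropped because `mouse_g2` points back to `bat_c3` instead. Extending the check to several species at once, as in the five-species comparison described above, would require the relationship to hold for each pair considered.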
PMCID: PMC3499531  PMID: 23166587
7.  Encoding of Physics Concepts: Concreteness and Presentation Modality Reflected by Human Brain Dynamics 
PLoS ONE  2012;7(7):e41784.
Previous research into working memory has focused on activations in different brain areas accompanying either different presentation modalities (verbal vs. non-verbal) or concreteness (abstract vs. concrete) of non-science concepts. Less research has been conducted investigating how scientific concepts are learned and further processed in working memory. To bridge this gap, the present study investigated human brain dynamics associated with encoding of physics concepts, taking both presentation modality and concreteness into account. Results of this study revealed greater theta and low-beta synchronization in the anterior cingulate cortex (ACC) during encoding of concrete pictures as compared to the encoding of both high and low imageable words. In visual brain areas, greater theta activity accompanying stimulus onsets was observed for words as compared to pictures while stronger alpha suppression was observed in responses to pictures as compared to words. In general, the EEG oscillation patterns for encoding words of different levels of abstractness were comparable but differed significantly from encoding of pictures. These results provide insights into the effects of modality of presentation on human encoding of scientific concepts and thus might help in developing new ways to better teach scientific concepts in class.
PMCID: PMC3407070  PMID: 22848602
8.  EEG Dynamics Reflect the Distinct Cognitive Process of Optic Problem Solving 
PLoS ONE  2012;7(7):e40731.
This study explores the changes in electroencephalographic (EEG) activity associated with the performance of solving an optics maze problem. College students (N = 37) were instructed to construct three solutions to the optical maze in a Web-based learning environment, which required some knowledge of physics. The subjects put forth their best effort to minimize the number of convexes and mirrors needed to guide the image of an object from the entrance to the exit of the maze. This study examines EEG changes in different frequency bands accompanying varying demands on the cognitive process of providing solutions. Results showed that the mean power of θ, α1, α2, and β1 significantly increased as the number of convexes and mirrors used by the students decreased from solution 1 to 3. Moreover, the mean power of θ and α1 significantly increased when the participants constructed their personal optimal solution (the smallest total number of mirrors and lenses used by students) compared to their non-personal optimal solution. In conclusion, the spectral power of frontal, frontal midline and posterior theta, posterior alpha, and temporal beta increased predominantly as the task demands and task performance increased.
PMCID: PMC3398019  PMID: 22815800
9.  An integrated transcriptomic and computational analysis for biomarker identification in gastric cancer 
Nucleic Acids Research  2010;39(4):1197-1207.
This report describes an integrated study on identification of potential markers for gastric cancer in patients’ cancer tissues and sera based on: (i) genome-scale transcriptomic analyses of 80 paired gastric cancer/reference tissues and (ii) computational prediction of blood-secretory proteins supported by experimental validation. Our findings show that: (i) 715 and 150 genes exhibit significantly differential expressions in all cancers and early-stage cancers versus reference tissues, respectively; and a substantial percentage of the alteration is found to be influenced by age and/or by gender; (ii) 21 co-expressed gene clusters have been identified, some of which are specific to certain subtypes or stages of the cancer; (iii) the top-ranked gene signatures give better than 94% classification accuracy between cancer and the reference tissues, some of which are gender-specific; and (iv) 136 of the differentially expressed genes were predicted to have their proteins secreted into blood, 81 of which were detected experimentally in the sera of 13 validation samples and 29 found to have differential abundances in the sera of cancer patients versus controls. Overall, the novel information obtained in this study has led to identification of promising diagnostic markers for gastric cancer and can benefit further analyses of the key (early) abnormalities during its development.
PMCID: PMC3045610  PMID: 20965966
10.  GolgiP: prediction of Golgi-resident proteins in plants 
Bioinformatics  2010;26(19):2464-2465.
Summary: We present a novel Golgi-prediction server, GolgiP, for computational prediction of both membrane- and non-membrane-associated Golgi-resident proteins in plants. We have employed a support vector machine-based classification method for the prediction of such Golgi proteins, based on three types of information: dipeptide composition, transmembrane domain(s) (TMDs) and functional domain(s) of a protein. The functional domain information is generated through searching against the Conserved Domains Database, and the TMD information includes the number of TMDs, the length of TMD and the number of TMDs at the N-terminus of a protein. Using GolgiP, we have made genome-scale predictions of Golgi-resident proteins in 18 plant genomes, and have performed a preliminary analysis of the predicted data.
Availability: The GolgiP web service is publicly available at
Supplementary information: Supplementary data are available at Bioinformatics online.
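Of the three feature types named in the summary, dipeptide composition is the simplest to illustrate. The sketch below computes a 400-element frequency vector over the 20 standard amino acids. It is a generic rendering of the feature, not GolgiP's own code; the function name, the handling of non-standard residues, and the example sequence are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered pairs of standard amino acids.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for a protein.

    Overlapping pairs containing a non-standard residue are skipped
    (an assumption made for this sketch).
    """
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Hypothetical peptide, for illustration only.
features = dipeptide_composition("MKKLLA")
```

A vector like this could then be concatenated with transmembrane-domain and functional-domain features before being passed to a classifier such as a support vector machine.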
PMCID: PMC2944200  PMID: 20733061
11.  Genome Sequence of the Anaerobic, Thermophilic, and Cellulolytic Bacterium “Anaerocellum thermophilum” DSM 6725 
Journal of Bacteriology  2009;191(11):3760-3761.
“Anaerocellum thermophilum” DSM 6725 is a strictly anaerobic bacterium that grows optimally at 75°C. It uses a variety of polysaccharides, including crystalline cellulose and untreated plant biomass, and has potential utility in biomass conversion. Here we report its complete genome sequence of 2.97 Mb, which is contained within one chromosome and two plasmids (of 8.3 and 3.6 kb). The genome encodes a broad set of cellulolytic enzymes, transporters, and pathways for sugar utilization and compared to those of other saccharolytic, anaerobic thermophiles is most similar to that of Caldicellulosiruptor saccharolyticus DSM 8903.
PMCID: PMC2681903  PMID: 19346307
12.  BIOSMILE: A semantic role labeling system for biomedical verbs using a maximum-entropy model with automatically generated template features 
BMC Bioinformatics  2007;8:325.
Bioinformatics tools for automatic processing of biomedical literature are invaluable for both the design and interpretation of large-scale experiments. Many information extraction (IE) systems that incorporate natural language processing (NLP) techniques have thus been developed for use in the biomedical field. A key IE task in this field is the extraction of biomedical relations, such as protein-protein and gene-disease interactions. However, most biomedical relation extraction systems usually ignore adverbial and prepositional phrases and words identifying location, manner, timing, and condition, which are essential for describing biomedical relations. Semantic role labeling (SRL) is a natural language processing technique that identifies the semantic roles of these words or phrases in sentences and expresses them as predicate-argument structures. We construct a biomedical SRL system called BIOSMILE that uses a maximum entropy (ME) machine-learning model to extract biomedical relations. BIOSMILE is trained on BioProp, our semi-automatic, annotated biomedical proposition bank. Currently, we are focusing on 30 biomedical verbs that are frequently used or considered important for describing molecular events.
To evaluate the performance of BIOSMILE, we conducted two experiments to (1) compare the performance of SRL systems trained on newswire and biomedical corpora; and (2) examine the effects of using biomedical-specific features. The experimental results show that using BioProp improves the F-score of the SRL system by 21.45% over an SRL system that uses a newswire corpus. It is noteworthy that adding automatically generated template features improves the overall F-score by a further 0.52%. Specifically, ArgM-LOC, ArgM-MNR, and Arg2 achieve statistically significant performance improvements of 3.33%, 2.27%, and 1.44%, respectively.
We demonstrate the necessity of using a biomedical proposition bank for training SRL systems in the biomedical domain. Besides the different characteristics of biomedical and newswire sentences, factors such as cross-domain framesets and verb usage variations also influence the performance of SRL systems. For argument classification, we find that NE (named entity) features indicating if the target node matches with NEs are not effective, since NEs may match with a node of the parsing tree that does not have semantic role labels in the training set. We therefore incorporate templates composed of specific words, NE types, and POS tags into the SRL system. As a result, the classification accuracy for adjunct arguments, which is especially important for biomedical SRL, is improved significantly.
PMCID: PMC2072962  PMID: 17764570
13.  Various criteria in the evaluation of biomedical named entity recognition 
BMC Bioinformatics  2006;7:92.
Text mining in the biomedical domain is receiving increasing attention. A key component of this process is named entity recognition (NER). Generally speaking, two annotated corpora, GENIA and GENETAG, are most frequently used for training and testing biomedical named entity recognition (Bio-NER) systems. JNLPBA and BioCreAtIvE are two major Bio-NER tasks using these corpora. The two tasks take different approaches to corpus annotation and use different matching criteria to evaluate system performance. This paper details these differences and describes alternative criteria. We then examine the impact of different criteria and annotation schemes on system performance by retesting systems that participated in the above two tasks.
To analyze the difference between JNLPBA's and BioCreAtIvE's evaluation, we conduct Experiment 1 to evaluate the top four JNLPBA systems using BioCreAtIvE's classification scheme. We then compare them with the top four BioCreAtIvE systems. Among them, three systems participated in both tasks, and each has an F-score lower on JNLPBA than on BioCreAtIvE. In Experiment 2, we apply hypothesis testing and correlation coefficients to find alternatives to BioCreAtIvE's evaluation scheme. The results show that the right-match and left-match criteria do not differ significantly from BioCreAtIvE's. In Experiment 3, we propose a customized relaxed-match criterion that uses right match and merges JNLPBA's five NE classes into two, which achieves an F-score of 81.5%. In Experiment 4, we evaluate a range of five matching criteria from loose to strict on the top JNLPBA system and examine the percentage of false negatives. Our experiment gives the relative change in precision, recall and F-score as matching criteria are relaxed.
In many applications, biomedical NEs could have several acceptable tags, which might just differ in their left or right boundaries. However, most corpora annotate only one of them. In our experiment, we found that right match and left match can be appropriate alternatives to JNLPBA and BioCreAtIvE's matching criteria. In addition, our relaxed-match criterion demonstrates that users can define their own relaxed criteria that correspond more realistically to their application requirements.
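The effect of relaxing the matching criterion can be sketched in a few lines. The code below is a simplified illustration, not the evaluation scripts of JNLPBA or BioCreAtIvE. It assumes that "left match" means the predicted and gold spans share a start offset and "right match" means they share an end offset, it ignores entity class, and all spans are hypothetical.

```python
def spans_match(pred, gold, criterion):
    """Compare two (start, end) spans under the given matching criterion."""
    p_start, p_end = pred
    g_start, g_end = gold
    if criterion == "strict":
        return p_start == g_start and p_end == g_end
    if criterion == "left":
        return p_start == g_start
    if criterion == "right":
        return p_end == g_end
    raise ValueError(f"unknown criterion: {criterion}")


def evaluate(predicted, gold, criterion):
    """Return (precision, recall, F-score); each gold span is matched at most once."""
    unmatched_gold = list(gold)
    true_positives = 0
    for pred in predicted:
        for candidate in unmatched_gold:
            if spans_match(pred, candidate, criterion):
                unmatched_gold.remove(candidate)
                true_positives += 1
                break
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical token-offset spans, for illustration only.
gold_spans = [(0, 3), (5, 9), (12, 14)]
predicted_spans = [(0, 3), (6, 9), (12, 15)]
strict_scores = evaluate(predicted_spans, gold_spans, "strict")
right_scores = evaluate(predicted_spans, gold_spans, "right")
left_scores = evaluate(predicted_spans, gold_spans, "left")
```

In this toy case only one prediction survives strict matching, while two survive under either relaxed criterion, so the F-score rises as boundary disagreements on one side are tolerated.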
PMCID: PMC1402329  PMID: 16504116

