As part of our Tenth Anniversary PLOS Biology Collection, PLOS' director of advocacy, Cameron Neylon, expounds on the need to improve and focus on our sharing infrastructure to maximize the reach of research communication.
It is unlikely that there is any single objective measure of merit, so research assessment requires new multivariate metrics that reflect the context of research, regardless of discipline.
The electronic laboratory notebook (ELN) has the potential to replace the paper notebook with a marked-up digital record that can be searched and shared. However, it is a challenge to achieve these benefits without losing the usability and flexibility of traditional paper notebooks. We investigate a blog-based platform that addresses the issues associated with the development of a flexible system for recording scientific research.
We chose a blog-based approach with the journal characteristics of traditional notebooks in mind, recognizing the potential for linking together procedures, materials, samples, observations, data, and analysis reports. We implemented the LabTrove blog system as a server process written in PHP, using a MySQL database to persist posts and other research objects. We incorporated a metadata framework that is both extensible and flexible while promoting consistency and structure where appropriate. Our experience thus far is that LabTrove is capable of providing a successful electronic laboratory recording system.
LabTrove implements a one-item one-post system, which enables us to uniquely identify each element of the research record, such as data, samples, and protocols. This unique association between a post and a research element affords advantages for monitoring the use of materials and samples and for inspecting research processes. The combination of the one-item one-post system, consistent metadata, and full-text search provides us with a much more effective record than a paper notebook. The LabTrove approach provides a route towards reconciling the tensions and challenges that lie ahead in working towards the long-term goals for ELNs. LabTrove, a blog-based electronic laboratory notebook (ELN) system from the Smart Research Framework with full access control, meets the scientific experimental recording requirements for reproducibility, reuse, repurposing, and redeployment.
To truly gain the benefits of open access, we need to look beyond “access” and ensure that open-access publishing enables re-use, legally and technically, to fully exploit opportunities provided by the worldwide web.
Pathogen genetics is already a mainstay of public health investigation and control efforts; now advances in technology make it possible to investigate the role of human genetic variation in the epidemiology of infectious diseases. To describe trends in this field, we analyzed articles that were published from 2001 through 2010 and indexed by the HuGE Navigator, a curated online database of PubMed abstracts in human genome epidemiology. We extracted the principal findings from all meta-analyses and genome-wide association studies (GWAS) with an infectious disease-related outcome. Finally, we compared the representation of diseases in HuGE Navigator with their contributions to morbidity worldwide. We identified 3,730 articles on infectious diseases, including 27 meta-analyses and 23 GWAS. The number published each year increased from 148 in 2001 to 543 in 2010 but remained a small fraction (about 7%) of all studies in human genome epidemiology. Most articles were by authors from developed countries, but the percentage by authors from resource-limited countries increased from 9% to 25% during the period studied. The most commonly studied diseases were HIV/AIDS, tuberculosis, hepatitis B infection, hepatitis C infection, sepsis, and malaria. As genomic research methods become more affordable and accessible, population-based research on infectious diseases will be able to examine the role of variation in human as well as pathogen genomes. This approach offers new opportunities for understanding infectious disease susceptibility, severity, treatment, control, and prevention.
Assessing an individual's research impact on the basis of a transparent algorithm is an important task for evaluation and comparison purposes. Between simple but inaccurate indices, such as counting the mere number of publications or accumulating overall citations, and highly detailed but overwhelming full-length publication lists, Hirsch (2005) introduced a single figure that cleverly combines both approaches. The so-called h-index has undoubtedly become the standard in scientometrics of individuals' research impact (note: in the present paper I always use the term “research impact” to describe research performance, both because the logic of the paper is based on the h-index, which quantifies the specific “impact” of, e.g., researchers, and because the genuine meaning of impact refers to quality as well). As the h-index reflects the number h of papers a researcher has published with at least h citations each, the index is inherently biased towards senior researchers. This can be problematic when predictive tools are needed for assessing young scientists' potential, especially when recruiting for early-career positions or equipping young scientists' labs. To remain compatible with the standard h-index, the proposed index integrates the scientist's research age into the h-index (the Carbon_h-factor), thus reporting the average gain in h-index per year. Comprehensive calculations of the Carbon_h-factor were made for four diverse research disciplines (economics, neuroscience, physics, and psychology) and for researchers performing at three levels of research impact (substantial, outstanding, and epochal), with ten researchers per category. For all research areas and output levels we obtained linear growth of the h-index, demonstrating that later research impact can validly be predicted at an early career stage, with the Carbon_h-factor being approximately 0.4, 0.8, and 1.5 for substantial, outstanding, and epochal researchers, respectively.
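The two quantities involved can be sketched in a few lines of Python. This is an illustration of the definitions above, not the authors' code; measuring research age as years since first publication is an assumption:

```python
def h_index(citations):
    """h-index: the largest h such that at least h papers have >= h citations."""
    cites = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(cites, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

def carbon_h_factor(citations, research_age_years):
    """Carbon_h-factor: average gain in h-index per year of research age."""
    return h_index(citations) / research_age_years

# Example: six papers, 12 years since first publication.
papers = [25, 17, 9, 4, 4, 1]
print(h_index(papers))              # 4
print(carbon_h_factor(papers, 12))  # ~0.33, near the "substantial" band (~0.4)
```

A researcher whose h-index has grown linearly at this rate for a decade would, per the paper's argument, be expected to continue at roughly the same slope.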
LPXTG proteins, present in most if not all Gram-positive bacteria, are known to be anchored by sortases to the bacterial peptidoglycan. More than one sortase gene is often encoded in a bacterial species, and each sortase is thought to specifically anchor given LPXTG proteins, depending on the sequence of the C-terminal cell wall sorting signal (cwss), which bears an LPXTG motif or another recognition sequence. B. anthracis possesses three sortase genes, yet sortase-deletion mutant strains are not affected in their virulence. To determine the sortase repertoires, we developed a genetic screen exploiting the property of the gamma phage to lyse bacteria only when its receptor, GamR, an LPXTG protein, is exposed at the surface. We identified 10 proteins that contain a cell wall sorting signal and are covalently anchored to the peptidoglycan. Some chimeric proteins yielded phage lysis in all sortase mutant strains, suggesting that cwss proteins remained surface-accessible in the absence of their anchoring sortase, probably as a consequence of membrane localization of as-yet uncleaved precursor proteins. For definitive assignment of the sortase repertoires, we therefore relied on a complementary biochemical test, namely immunoblot experiments. The sortase anchoring nine of these proteins has thus been determined. The absence of a virulence defect in the sortase mutants could be a consequence of the membrane localization of the cwss proteins.
In this piece I would like to tell a few stories; three stories, to be precise. First, I want to explain where I am, where I've come from, and what has led me to the views I hold today. I find myself at an interesting point in my life and career, just as the research community is undergoing massive change. The second story is about what the world might look like at some point in the future. What might we achieve? What might it look like? And what will be possible? Finally, I want to ask how we get there from here. What is the unifying idea or movement that actually has the potential to carry us forward in a positive way? At the end of this I'm going to ask you, the reader, to commit to something as part of the process of making that happen.
Many initiatives encourage investigators to share their raw datasets in hopes of increasing research efficiency and quality. Despite these investments of time and money, we do not have a firm grasp of who openly shares raw research data, who doesn't, and which initiatives are correlated with high rates of data sharing. In this analysis I use bibliometric methods to identify patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication.
Automated methods identified 11,603 articles published between 2000 and 2009 that describe the creation of gene expression microarray data. Associated datasets in best-practice repositories were found for 25% of these articles, increasing from less than 5% in 2001 to 30%–35% in 2007–2009. Accounting for sensitivity of the automated methods, approximately 45% of recent gene expression studies made their data publicly available.
First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available.
These results suggest research data sharing levels are still low and increasing only slowly, and data is least available in areas where it could make the biggest impact. Let's learn from those with high rates of sharing to embrace the full potential of our research output.
Scientific research in the 21st century is more data intensive and collaborative than in the past. It is important to study the data practices of researchers – data accessibility, discovery, re-use, preservation and, particularly, data sharing. Data sharing is a valuable part of the scientific method allowing for verification of results and extending research from prior results.
A total of 1,329 scientists participated in this survey exploring current data sharing practices and perceptions of the barriers to and enablers of data sharing. Scientists do not make their data electronically available to others for various reasons, including insufficient time and lack of funding. Most respondents are satisfied with their current processes for the initial and short-term parts of the data or research lifecycle (collecting, searching for, describing or cataloging, analyzing, and short-term storage of their research data) but are not satisfied with long-term data preservation. Many organizations do not provide support to their researchers for either short- or long-term data management. Respondents agree they are willing to share their data if certain conditions are met, such as formal citation and the sharing of reprints. There are also significant differences in data management practices and approaches based on primary funding agency, subject discipline, age, work focus, and world region.
Barriers to effective data sharing and preservation are deeply rooted in the practices and culture of the research process as well as the researchers themselves. New mandates for data management plans from NSF and other federal agencies and world-wide attention to the need to share and preserve data could lead to changes. Large scale programs, such as the NSF-sponsored DataNET (including projects like DataONE) will both bring attention and resources to the issue and make it easier for scientists to apply sound data management principles.
Directed protein evolution has been used to modify protein activity and research has been carried out to enhance the production of high quality mutant libraries. Many theoretical approaches suggest that allowing a population to undergo neutral selection may be valuable in directed evolution experiments.
Here we report an investigation into the value of neutral selection in a classical model system for directed evolution, the conversion of E. coli β-glucuronidase to a β-galactosidase activity. We find that neutral selection, i.e., selection for retaining glucuronidase activity, can efficiently identify the majority of the mutation sites previously identified as beneficial for galactosidase activity. Each variant with increased galactosidase activity identified in our neutral drift experiments contained a mutation at one of four sites: T509, S557, N566, or W529. All of these sites had previously been identified using direct selection for β-galactosidase activity.
Our results are consistent with others showing that a neutral selection approach can be effective in selecting improved variants. However, we interpret our results to show that neutral selection is, in this case, no more efficient than conventional directed evolution approaches. Nevertheless, the neutral approach is likely to be beneficial when the resulting library can be screened for a range of related activities. More detailed statistical studies to resolve the apparent differences between this system and others are likely to be a fruitful avenue for future research.
LC8 dynein light chain (DYNLL) is a eukaryotic hub protein that is thought to function as a dimerization engine. Its interacting partners are involved in a wide range of cellular functions. In each of its dozens of hitherto identified binding partners, DYNLL binds a linear peptide segment. The known segments define a loosely characterized binding motif, [D/S]-4K-3X-2[T/V/I]-1Q0[T/V]1[D/E]2, where the subscripts number the positions relative to the conserved glutamine at position 0. The motifs are localized in disordered segments of the DYNLL-binding proteins and are often flanked by coiled-coil or other potential dimerization domains. Based on a directed evolution approach, here we provide the first quantitative characterization of the binding preference of the DYNLL binding site. We displayed on M13 phage a naïve peptide library with seven fully randomized positions around a fixed, naturally conserved glutamine. The peptides were presented in a bivalent manner, fused to a leucine zipper mimicking the natural dimer-to-dimer binding stoichiometry of DYNLL-partner complexes. The phage-selected consensus sequence V-5S-4R-3G-2T-1Q0T1E2 resembles the natural one, but is extended by an additional N-terminal valine, which increases the affinity of the monomeric peptide twentyfold. Leucine zipper dimerization increases the affinity into the subnanomolar range. By comparing crystal structures of an SRGTQTE-DYNLL and a dimeric VSRGTQTE-DYNLL complex, we find that the affinity-enhancing valine is accommodated in a binding pocket on DYNLL. Based on the in vitro evolved sequence pattern, we predict a large number of novel DYNLL binding partners in the human proteome. Among these, EML3, a microtubule-binding protein involved in mitosis, contains an exact match of the phage-evolved consensus and binds DYNLL with nanomolar affinity. These results significantly widen the scope of the human interactome around DYNLL and will certainly shed more light on the biological functions and organizing role of DYNLL in the human and other eukaryotic interactomes.
Increased competition for research funding has led to growth in proposal submissions and lower funding-success rates. An agent-based model of the funding cycle, accounting for variations in program officer and reviewer behaviors, for a range of funding rates, is used to assess the efficiency of different proposal-submission strategies. Program officers who use more reviewers and require consensus can improve the chances of scientists submitting fewer proposals. Selfish or negligent reviewers reduce the effectiveness of submitting more proposals, but have less influence as available funding declines. Policies designed to decrease proposal submissions reduce reviewer workload, but can lower the quality of funded proposals. When available funding falls below 10–15% in this model, the most effective strategy for scientists to maintain funding is to submit many proposals.
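The dynamics described above can be illustrated with a minimal toy funding cycle in Python. This is an illustrative sketch only, not the paper's actual agent-based model; all parameters and behavioral rules here are assumptions:

```python
import random

def funding_round(n_scientists=100, proposals_each=3, funding_rate=0.15,
                  n_reviewers=3, review_noise=0.3, seed=0):
    """One round of a toy funding cycle: each scientist submits several
    proposals; each proposal receives an averaged noisy review score; the
    top `funding_rate` fraction of proposals is funded. Returns the
    fraction of scientists winning at least one award."""
    rng = random.Random(seed)
    proposals = []
    for s in range(n_scientists):
        ability = rng.random()
        for _ in range(proposals_each):
            quality = ability + rng.gauss(0, 0.1)
            # Average several noisy reviews; larger review_noise mimics
            # negligent reviewers, smaller mimics consensus panels.
            score = sum(quality + rng.gauss(0, review_noise)
                        for _ in range(n_reviewers)) / n_reviewers
            proposals.append((score, s))
    proposals.sort(reverse=True)
    n_funded = int(funding_rate * len(proposals))
    funded_scientists = {s for _, s in proposals[:n_funded]}
    return len(funded_scientists) / n_scientists

print(funding_round(proposals_each=1))
print(funding_round(proposals_each=5))  # more submissions per scientist
```

Varying `proposals_each`, `funding_rate`, and `review_noise` in a sketch like this reproduces the qualitative trade-off in the abstract: submitting more proposals raises an individual's funding chances, at the cost of a larger total review workload.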
Phage display is a platform for the selection of specific binding molecules, which is a clear-cut motivation for increasing its performance. Polypeptides are normally displayed as fusions to the major coat protein VIII (pVIII) or the minor coat protein III (pIII). Display on other coat proteins such as pVII allows heterologous peptide sequences to be displayed on the virion in addition to those displayed on pIII and pVIII, and also serves as an alternative to pIII or pVIII display.
Here we demonstrate how standard pIII or pVIII display phagemids are complemented with a helper phage that supports production of virions tagged with octa-FLAG, HIS6, or AviTag on pVII. The periplasmic signal sequence that is required for pIII and pVIII display, and that was added to pVII in earlier studies, is omitted altogether.
Tagging on pVII is an important and very useful add-on to standard pIII and pVIII display. Any phagemid bearing a protein of interest on either pIII or pVIII can be tagged with any of these tags simply by choice of helper phage. We show in this paper how such tags may be utilized for immobilization and separation, as well as purification and detection, of monoclonal and polyclonal phage populations.
Phage display is a leading technology for selection of binders with affinity for specific target molecules. Polypeptides are normally displayed as fusions to the major coat protein VIII (pVIII) or the minor coat protein III (pIII). Whereas pVIII display suffers from drawbacks such as heterogeneity in display levels and polypeptide fusion size limitations, toxicity and infection interference effects have been described for pIII display. Thus, display on other coat proteins such as pVII or pIX might be more attractive. Neither pVII nor pIX display has gained widespread use or been characterized in as much detail as pIII and pVIII display.
Here we present a side-by-side comparison of display on pIII with display on pVII and pIX. Polypeptides of interest (POIs) are fused to pVII or pIX. The N-terminal periplasmic signal sequence, which is required for incorporation of pIII and pVIII into the phage and which was added to pVII and pIX in earlier studies, is omitted altogether. Although the POI display level on pIII is higher than on pVII and pIX, affinity selection with pVII and pIX display libraries is shown to be particularly efficient.
Display on pVII and pIX represents a platform with characteristics that differ from those of the pIII platform. We have exploited this to increase the performance and expand the use of phage display. In this paper, we describe effective affinity selection of folded domains displayed on pVII or pIX, which makes both platforms more attractive alternatives to conventional pIII and pVIII display than they were before.
In narrow pore ion channels, ions and water molecules diffuse in single file and cannot pass each other. Under such constraints, ion and water fluxes are coupled, leading to experimentally observable phenomena such as the streaming potential. Analysis of this coupled flux would provide unprecedented insights into the mechanism of permeation. In this study, we focused on ion and water permeation through the KcsA potassium channel, for which an eight-state discrete-state Markov model exhibiting four ion-binding sites has been proposed based on the crystal structure. Random transitions on the model generate the net flux. Here we introduced the concept of cycle flux to derive exact solutions of experimental observables from the permeation model. There are multiple cyclic paths on the model, and random transitions complete the cycles; the rate of cycle completion is called the cycle flux. The net flux is generated by a combination of cyclic paths, each with its own cycle flux. T. L. Hill developed a graphical method for obtaining exact solutions for the cycle flux, and we extended this method to calculate the one-way cycle fluxes of the KcsA channel. By assigning stoichiometric numbers for ion and water transfer to each cycle, we established a method to calculate the water-ion coupling ratio (CRw-i) through cycle flux algebra. These calculations predict that CRw-i increases at low potassium concentrations. This affords an intuitive picture of permeation as random transitions among cyclic paths, with the relative contributions of the cycle fluxes yielding the experimental observables.
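The steady-state flux bookkeeping underlying this kind of model can be illustrated on a toy system: a 3-state cyclic Markov model (not the paper's 8-state KcsA model; the rates are invented). At steady state, the same net flux circulates through every edge of a single cycle, which is exactly the quantity the cycle-flux formalism decomposes:

```python
import numpy as np

# k[i, j] is the rate of the i -> j transition in a toy 3-state cycle.
k = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])

# Build the generator matrix Q (rows sum to zero) and solve for the
# stationary distribution p: Q^T p = 0 with the constraint sum(p) = 1.
Q = k - np.diag(k.sum(axis=1))
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
p, *_ = np.linalg.lstsq(A, b, rcond=None)

# Net steady-state flux across each edge: J_ij = p_i * k_ij - p_j * k_ji.
J = p[:, None] * k - p[None, :] * k.T
print(p)        # stationary occupancies of the three states
print(J[0, 1])  # net flux on the 0 -> 1 edge (equal on every edge of the cycle)
```

With several interlocking cycles, as in the KcsA model, the net flux on any edge becomes a sum of contributions from each cycle passing through it, and attaching ion/water stoichiometric numbers to each cycle yields quantities such as the coupling ratio.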
Immobilized Metal Affinity Chromatography (IMAC) has been used for decades to purify proteins on the basis of amino acid content, especially surface-exposed histidines and “histidine tags” genetically added to recombinant proteins. We and others have extended the use of IMAC to purification of nucleic acids via interactions with the nucleotide bases, especially purines, of single-stranded RNA and DNA. We also have demonstrated the purification of plasmid DNA from contaminating genomic DNA by IMAC capture of selectively denatured genomic DNA. Here we describe an efficient method of purifying PCR products by specifically removing error products, excess primers, and unincorporated dNTPs from PCR product mixtures using flow-through metal-chelate affinity adsorption. By flowing a PCR product mixture through a Cu2+-iminodiacetic acid (IDA) agarose spin column, 94–99% of the dNTPs and nearly all the primers can be removed. Many of the error products commonly formed by Taq polymerase are also removed. Sequencing of the IMAC-processed PCR product gave base-calling accuracy comparable to that obtained with a commercial PCR product purification method. The results show that IMAC matrices (specifically Cu2+-IDA agarose) can be used for the purification of PCR products. Due to the generality of the base-specific mechanism of adsorption, IMAC matrices may also be used in the purification of oligonucleotides, cDNA, mRNA, and microRNAs.
BmK IT2 is regarded as a receptor site-4 modulator of sodium channels with depressant insect toxicity. It also displays anti-nociceptive and anti-convulsant activities in rat models. In this study, the potency and efficacy of BmK IT2 were assessed and compared for the first time among four sodium channel isoforms expressed in Xenopus oocytes. Combined with a molecular approach, the receptor site of BmK IT2 was further localized.
2 µM BmK IT2 strongly shifted the activation of DmNav1, the sodium channel from Drosophila, to more hyperpolarized potentials, whereas it hardly affected the gating properties of rNav1.2, rNav1.3, and mNav1.6, three mammalian central neuronal sodium channel subtypes. (1) Mutations of Glu896, Leu899, and Gly904 in the extracellular Domain II S3–S4 loop of DmNav1 abolished the functional action of BmK IT2. (2) The preference of BmK IT2 for DmNav1 could be conferred by Domain III. Analysis of subsequent DmNav1 mutants highlighted residues in the Domain III pore loop; in particular, Ile1529 was critical for recognition and binding of BmK IT2.
In this study, BmK IT2 displayed complete insect selectivity. Two binding regions, comprising Domains II and III of DmNav1, play separate but indispensable roles in the interaction with BmK IT2. The insensitivity of Nav1.2, Nav1.3, and Nav1.6 to BmK IT2 suggests that other isoforms or mechanisms might be involved in the suppressive activity of BmK IT2 in rat pathological models.
Ovarian cancer is the most lethal gynecological malignancy, and the ovarian clear cell carcinoma subtype (OCCA) demonstrates a particularly poor response to standard treatment. Improvements in ovarian cancer outcomes, especially for OCCA, could be expected from a clearer understanding of the molecular pathology that might guide strategies for earlier diagnosis and more effective treatment.
Cell-SELEX technology was employed to develop new molecular probes for ovarian cancer cell surface markers. A total of thirteen aptamers with dissociation constants (Kd) for ovarian cancer cells in the pico- to nanomolar range were obtained. A preliminary investigation of the targets of these aptamers and their binding characteristics was also performed.
We have selected a series of aptamers that bind to different types of ovarian cancer, but not to cervical cancer. Although binding to other cancer cell lines was observed, these aptamers could lead to the identification of cancer-related biomarkers.
Seasonal depression has generated considerable clinical interest in recent years. Despite a common belief that people in higher latitudes are more vulnerable to low mood during the winter, it has never been demonstrated that human moods are subject to seasonal change on a global scale. The aim of this study was to investigate large-scale seasonal patterns of depression using Internet search query data as a signature and proxy of human affect.
Our study was based on a publicly available search engine database, Google Insights for Search, which provides time series data of weekly search trends from January 1, 2004 to June 30, 2009. We applied an empirical mode decomposition method to isolate the seasonal component of health-related search trends for depression in 54 geographic areas worldwide. We identified a seasonal trend in depression searches that was opposite between the northern and southern hemispheres; this trend was significantly correlated with seasonal oscillations of temperature (USA: r = −0.872, p<0.001; Australia: r = −0.656, p<0.001). Based on analyses of search trends across the 54 geographic locations worldwide, we found that the degree of correlation between searches for depression and temperature was latitude-dependent (northern hemisphere: r = −0.686, p<0.001; southern hemisphere: r = 0.871, p<0.0001).
Our findings indicate that Internet searches for depression by people in higher latitudes are more susceptible to seasonal change, whereas this phenomenon is obscured in tropical areas. This phenomenon exists universally across countries, regardless of language. This study provides novel, Internet-based evidence for the epidemiology of seasonal depression.
The glucocorticoid receptor (GR) is a transcription factor that regulates gene expression in a ligand-dependent fashion. This modular protein is one of the major pharmacological targets due to its involvement in both the cause and treatment of many human diseases. Intense efforts have been made to elucidate the molecular basis of GR activity.
Here, the behavior of four GR-ligand complexes with different glucocorticoid and antiglucocorticoid properties was evaluated. The ability of the GR-ligand complexes to oligomerize in vivo was analyzed using the novel Number and Brightness assay. The results showed that most GR molecules form homodimers inside the nucleus upon ligand binding. Additionally, in vitro GR-DNA binding analyses suggest that ligand structure modulates GR-DNA interaction dynamics rather than the receptor's ability to bind DNA. On the other hand, coimmunoprecipitation studies were used to evaluate the in vivo interaction between the transcriptional intermediary factor 2 (TIF2) coactivator and the different GR-ligand complexes. No correlation was found between GR intranuclear distribution, cofactor recruitment, and the homodimerization process. Finally, molecular determinants that support the observed GR LBD-ligand/TIF2 interaction were identified by molecular dynamics simulation.
The data presented here support the idea that in vivo GR homodimerization inside the nucleus can be achieved in a DNA-independent fashion, without ruling out a DNA-dependent pathway as well. Moreover, since at least one GR-ligand complex is able to induce homodimer formation while preventing TIF2 coactivator interaction, the results suggest that these two events might be independent of each other. Finally, 21-hydroxy-6,19-epoxyprogesterone emerges as a selective glucocorticoid of potential pharmacological interest. Given that GR homodimerization and cofactor recruitment are considered essential steps in the receptor activation pathway, the results presented here contribute to understanding how specific ligands influence GR behavior.
Targeting stem cells holds great potential for the study of embryonic stem cells and the development of stem cell-based regenerative medicine. Previous studies have demonstrated that nanoparticles can serve as a robust platform for gene delivery, non-invasive cell imaging, and manipulation of stem cell differentiation. However, specific targeting of embryonic stem cells by peptide-linked nanoparticles has not been reported.
Here, we developed a method for screening peptides that specifically recognize rhesus macaque embryonic stem cells by phage display, and used the peptides to facilitate quantum dot targeting of embryonic stem cells. Through a phage display screen, we found that phages displaying an APWHLSSQYSRT peptide showed high affinity and specificity for undifferentiated primate embryonic stem cells in an enzyme-linked immunosorbent assay. These results were subsequently confirmed by immunofluorescence microscopy. Additionally, this binding could be competed by the chemically synthesized APWHLSSQYSRT peptide, indicating that the binding capability was specific and conferred by the peptide sequence. Through ligation of the peptide to CdSe-ZnS core-shell nanocrystals, we were able, for the first time, to target embryonic stem cells through peptide-conjugated quantum dots.
These data demonstrate that our established method of screening for embryonic stem cell specific binding peptides by phage display is feasible. Moreover, the peptide-conjugated quantum dots may be applicable for embryonic stem cell study and utilization.
Vascular endothelial growth factor C (VEGF-C) is a key mediator of lymphangiogenesis, acting via its receptors VEGF-R2 and VEGF-R3. High expression of VEGF-C in tumors correlates with increased lymphatic vessel density, lymphatic vessel invasion, sentinel lymph node metastasis, and poor prognosis. Recently, we found that in a chemically induced skin carcinoma model, increased VEGF-C drainage from the tumor enhanced lymphangiogenesis in the sentinel lymph node and facilitated metastatic spread of cancer cells via the lymphatics. Hence, interference with the VEGF-C/VEGF-R3 axis holds promise to block metastatic spread, as recently shown by use of a neutralizing anti-VEGF-R3 antibody and a soluble VEGF-R3 (VEGF-C/D trap). Using antibody phage display, we have developed a human monoclonal antibody fragment (single-chain fragment variable, scFv) that binds with high specificity and affinity to the fully processed mature form of human VEGF-C. The scFv binds to an epitope on VEGF-C that is important for receptor binding, since binding of the scFv to VEGF-C dose-dependently inhibits the binding of VEGF-C to VEGF-R2 and VEGF-R3, as shown by BIAcore and ELISA analyses. Interestingly, the variable heavy domain (VH) of the anti-VEGF-C scFv, which contains a mutation typical for camelid heavy chain-only antibodies, is sufficient for binding VEGF-C. This reduces the size of the potentially VEGF-C-blocking antibody fragment to only 14.6 kDa. Anti-VEGF-C VH-based immunoproteins hold promise to block the lymphangiogenic activity of VEGF-C, which would present a significant advance in inhibiting lymphatic-based metastatic spread of certain cancer types.
Digital networks, mobile devices, and the possibility of mining the ever-increasing amount of digital traces that we leave behind in our daily activities are changing the way we can approach the study of human and social interactions. Large-scale datasets, however, are mostly available for collective and statistical behaviors, at coarse granularities, while high-resolution data on person-to-person interactions are generally limited to relatively small groups of individuals. Here we present a scalable experimental framework for gathering real-time data resolving face-to-face social interactions with tunable spatial and temporal granularities.
We use active Radio Frequency Identification (RFID) devices that assess mutual proximity in a distributed fashion by exchanging low-power radio packets. We analyze the dynamics of person-to-person interaction networks obtained in three high-resolution experiments carried out at different orders of magnitude in community size. The data sets exhibit common statistical properties and lack of a characteristic time scale from 20 seconds to several hours. The association between the number of connections and their duration shows an interesting super-linear behavior, which indicates the possibility of defining super-connectors both in the number and intensity of connections.
Taking advantage of scalability and resolution, this experimental framework allows the monitoring of social interactions, uncovering similarities in the way individuals interact in different contexts, and identifying patterns of super-connector behavior in the community. These results could impact our understanding of all phenomena driven by face-to-face interactions, such as the spreading of transmissible infectious diseases and information.
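The super-linear association between number of connections and total interaction time can be illustrated by a log-log fit. The data below are synthetic; the exponent 1.3 and the noise level are assumptions for illustration, not values from the study:

```python
import math
import random

# Synthetic per-person data: k contacts, total interaction time t,
# generated with an assumed super-linear law t ~ k**1.3 plus log-normal noise.
rng = random.Random(42)
ks = [rng.randint(2, 200) for _ in range(500)]
ts = [k ** 1.3 * math.exp(rng.gauss(0, 0.2)) for k in ks]

# Least-squares slope of log(t) versus log(k); a slope > 1 indicates
# super-linear growth, i.e. the most-connected individuals also spend
# disproportionately long in each contact ("super-connectors").
xs = [math.log(k) for k in ks]
ys = [math.log(t) for t in ts]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
print(round(slope, 2))  # close to the assumed exponent 1.3
```

A slope of exactly 1 would mean interaction time grows proportionally with the number of contacts; the recovered slope above 1 is the signature of the super-connector behavior described in the findings.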