Title: | Accessing the 'CHILDES' Database |
---|---|
Description: | Tools for connecting to 'CHILDES', an open repository for transcripts of parent-child interaction. For more information on the underlying data, see <https://langcog.github.io/childes-db-website/>. |
Authors: | Mika Braginsky [aut, cre], Alessandro Sanchez [aut, ctb], Daniel Yurovsky [aut], Kyle MacDonald [ctb], Stephan Meylan [ctb], Jessica Mankewitz [ctb] |
Maintainer: | Mika Braginsky <[email protected]> |
License: | GPL-3 |
Version: | 0.2.3 |
Built: | 2024-11-03 03:56:00 UTC |
Source: | https://github.com/langcog/childesr |
Clear all MySQL connections
clear_connections()
Connect to CHILDES
connect_to_childes(db_version = "current", db_args = NULL)
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
con: A DBIConnection object for the CHILDES database
## Not run:
con <- connect_to_childes(db_version = "current", db_args = NULL)
DBI::dbDisconnect(con)
## End(Not run)
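When several queries share one connection, it can help to guarantee that the connection is closed even if a query fails. The following is a minimal sketch and not part of the package: `with_childes` is a hypothetical helper that wraps `connect_to_childes()` and `DBI::dbDisconnect()`, and it assumes the 'childesr' and 'DBI' packages are installed.

```r
# Hypothetical helper (not provided by the package): open a connection,
# pass it to the function `f`, and always disconnect afterwards.
with_childes <- function(f, db_version = "current", db_args = NULL) {
  con <- childesr::connect_to_childes(db_version = db_version,
                                      db_args = db_args)
  # on.exit() runs even when f(con) raises an error
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  f(con)
}

## Not run (requires network access to the database):
# corpora <- with_childes(function(con) {
#   # collect() pulls the remote query into a local tibble
#   # before the connection is closed
#   dplyr::collect(childesr::get_corpora(connection = con))
# })
```

Because results obtained through a supplied connection remain remote queries, they should be collected inside `f` before the helper disconnects.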
Get collections
get_collections(connection = NULL, db_version = "current", db_args = NULL)
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
A 'tbl' of Collection data. If 'connection' is supplied, the result remains a remote query, otherwise it is retrieved into a local tibble.
## Not run: get_collections() ## End(Not run)
Get content
get_content( content_type, collection = NULL, language = NULL, corpus = NULL, role = NULL, role_exclude = NULL, age = NULL, sex = NULL, target_child = NULL, token = NULL, stem = NULL, part_of_speech = NULL, connection )
content_type |
One of "token" or "utterance" |
collection |
A character vector of one or more names of collections |
language |
A character vector of one or more languages |
corpus |
A character vector of one or more names of corpora |
role |
A character vector of one or more roles to include |
role_exclude |
A character vector of one or more roles to exclude |
age |
A numeric vector of a single age value or a min age value and max age value (inclusive) in months. For a single age value, participants are returned for which that age is within their age range; for two ages, participants are returned whose age range overlaps with the interval between those two ages. |
sex |
A character vector of values "male" and/or "female" |
target_child |
A character vector of one or more names of children |
token |
A character vector of one or more token patterns ('%' matches any number of wildcard characters, '_' matches exactly one wildcard character) |
stem |
A character vector of one or more stems |
part_of_speech |
A character vector of one or more parts of speech |
connection |
A connection to the CHILDES database |
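The `token` wildcards described above can be tried out locally without a database connection. This is only an illustrative sketch: `like_to_regex` is a hypothetical helper that maps '%' and '_' onto glob wildcards and then onto a regular expression. It assumes the pattern contains no literal '*' or '?', and it says nothing about how the database itself handles letter case.

```r
# Hypothetical helper: translate a token pattern into a regular expression.
like_to_regex <- function(pattern) {
  # '%' becomes '*' (any run of characters), '_' becomes '?' (one character);
  # glob2rx() then converts the glob into an anchored regular expression.
  utils::glob2rx(chartr("%_", "*?", pattern))
}

tokens <- c("dog", "dogs", "doggy", "dig", "hotdog")

starts_with_dog <- grepl(like_to_regex("dog%"), tokens)  # dog, dogs, doggy
three_letters   <- grepl(like_to_regex("d_g"), tokens)   # dog, dig
ends_with_dog   <- grepl(like_to_regex("%dog"), tokens)  # dog, hotdog

tokens[starts_with_dog]
tokens[three_letters]
tokens[ends_with_dog]
```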
Get the utterances surrounding one or more tokens
get_contexts( collection = NULL, language = NULL, corpus = NULL, role = NULL, role_exclude = NULL, age = NULL, sex = NULL, target_child = NULL, token, window = c(0, 0), remove_duplicates = TRUE, connection = NULL, db_version = "current", db_args = NULL )
collection |
A character vector of one or more names of collections |
language |
A character vector of one or more languages |
corpus |
A character vector of one or more names of corpora |
role |
A character vector of one or more roles to include |
role_exclude |
A character vector of one or more roles to exclude |
age |
A numeric vector of a single age value or a min age value and max age value (inclusive) in months. For a single age value, participants are returned for which that age is within their age range; for two ages, participants are returned whose age range overlaps with the interval between those two ages. |
sex |
A character vector of values "male" and/or "female" |
target_child |
A character vector of one or more names of children |
token |
A character vector of one or more token patterns ('%' matches any number of wildcard characters, '_' matches exactly one wildcard character) |
window |
A length 2 numeric vector of how many utterances before and after each utterance containing the target token to retrieve |
remove_duplicates |
A boolean indicating whether to remove duplicate utterances from the results |
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
A 'tbl' of Utterance data, filtered down by supplied arguments.
## Not run: get_contexts(target_child = "Shem", token = "dog") ## End(Not run)
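To make the `window` and `remove_duplicates` arguments concrete, the sketch below mimics their described effect on a small local vector of utterances. It is an assumption-laden illustration, not the package's implementation: `context_indices` and the sample utterances are hypothetical, and the real function works on database tables rather than on a character vector.

```r
# Toy data standing in for consecutive utterances of one transcript.
utterances <- c("look there", "a dog", "yes", "big dog", "bye bye")

# Hypothetical helper: positions of every utterance within
# window[1] before and window[2] after each hit.
context_indices <- function(hits, window, n, remove_duplicates = TRUE) {
  idx <- unlist(lapply(hits, function(i) {
    # clamp the window to the valid range 1..n
    seq(max(1, i - window[1]), min(n, i + window[2]))
  }))
  if (remove_duplicates) unique(idx) else idx
}

hits <- grep("dog", utterances)  # utterances containing the target token
n <- length(utterances)

only_hits   <- context_indices(hits, c(0, 0), n)
with_window <- context_indices(hits, c(1, 1), n)
with_dups   <- context_indices(hits, c(1, 1), n, remove_duplicates = FALSE)

utterances[only_hits]    # just the two utterances containing "dog"
utterances[with_window]  # one utterance before and after each hit
```

With `window = c(1, 1)` the two context windows share the utterance "yes", which is why the de-duplicated result is one element shorter than `with_dups`.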
Get corpora
get_corpora(connection = NULL, db_version = "current", db_args = NULL)
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
A 'tbl' of Corpus data. If 'connection' is supplied, the result remains a remote query, otherwise it is retrieved into a local tibble.
## Not run: get_corpora() ## End(Not run)
Get information on database connection options
get_db_info()
List of database info: host name, current version, supported versions, historical versions, username, password
## Not run: get_db_info() ## End(Not run)
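The `db_args` argument accepted throughout this package is described as a list with host, user, and password defined. One possible way to build such a list without hard-coding credentials is sketched below. The environment variable names and the fallback values are assumptions made for this example only, not defaults of the package.

```r
# Assumption: CHILDES_HOST, CHILDES_USER and CHILDES_PASSWORD are
# hypothetical environment variable names chosen for this sketch.
db_args <- list(
  host     = Sys.getenv("CHILDES_HOST", unset = "localhost"),
  user     = Sys.getenv("CHILDES_USER", unset = "guest"),
  password = Sys.getenv("CHILDES_PASSWORD", unset = "")
)

names(db_args)

## Not run (requires a reachable database):
# con <- childesr::connect_to_childes(db_args = db_args)
# DBI::dbDisconnect(con)
```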
Get participants
get_participants( collection = NULL, corpus = NULL, target_child = NULL, role = NULL, role_exclude = NULL, age = NULL, sex = NULL, connection = NULL, db_version = "current", db_args = NULL )
collection |
A character vector of one or more names of collections |
corpus |
A character vector of one or more names of corpora |
target_child |
A character vector of one or more names of children |
role |
A character vector of one or more roles to include |
role_exclude |
A character vector of one or more roles to exclude |
age |
A numeric vector of a single age value or a min age value and max age value (inclusive) in months. For a single age value, participants are returned for which that age is within their age range; for two ages, participants are returned whose age range overlaps with the interval between those two ages. |
sex |
A character vector of values "male" and/or "female" |
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
A 'tbl' of Participant data, filtered down by supplied arguments. If 'connection' is supplied, the result remains a remote query, otherwise it is retrieved into a local tibble.
## Not run: get_participants() ## End(Not run)
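The two forms of the `age` argument can be illustrated on a small local data frame. This is a sketch of the selection rule as described in the argument text, not the package's own code: `filter_by_age`, the sample participants, and the column names `min_age` and `max_age` are assumptions introduced for the example.

```r
# Toy participants with the age range (in months) over which each was recorded.
participants <- data.frame(
  name    = c("A", "B", "C"),
  min_age = c(10, 24, 40),
  max_age = c(20, 36, 60),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose [min_age, max_age] range contains a
# single age, or overlaps the inclusive interval given by two ages.
filter_by_age <- function(df, age) {
  stopifnot(is.numeric(age), length(age) %in% c(1, 2))
  lo <- min(age)
  hi <- max(age)  # for a single age, lo and hi are the same value
  df[df$min_age <= hi & df$max_age >= lo, ]
}

single_age <- filter_by_age(participants, 30)$name         # only B spans 30
age_range  <- filter_by_age(participants, c(18, 25))$name  # A and B overlap
edge_case  <- filter_by_age(participants, c(36, 40))$name  # bounds inclusive
```

Treating a single age as an interval of zero width lets one overlap test cover both cases.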
Get speaker statistics
get_speaker_statistics( collection = NULL, corpus = NULL, target_child = NULL, role = NULL, role_exclude = NULL, age = NULL, sex = NULL, connection = NULL, db_version = "current", db_args = NULL )
collection |
A character vector of one or more names of collections |
corpus |
A character vector of one or more names of corpora |
target_child |
A character vector of one or more names of children |
role |
A character vector of one or more roles to include |
role_exclude |
A character vector of one or more roles to exclude |
age |
A numeric vector of a single age value or a min age value and max age value (inclusive) in months. For a single age value, participants are returned for which that age is within their age range; for two ages, participants are returned whose age range overlaps with the interval between those two ages. |
sex |
A character vector of values "male" and/or "female" |
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
A 'tbl' of Speaker statistics, filtered down by supplied arguments. If 'connection' is supplied, the result remains a remote query, otherwise it is retrieved into a local tibble.
## Not run: get_speaker_statistics() ## End(Not run)
Run a SQL Query script on the CHILDES database
get_sql_query( sql_query_string, connection = NULL, db_version = "current", db_args = NULL )
sql_query_string |
A character string containing a valid SQL query |
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
The database after calling the supplied SQL query
## Not run: get_sql_query("SELECT * FROM collection") ## End(Not run)
Get table
get_table(connection, name)
connection |
A connection to the CHILDES database |
name |
String of a table name |
A 'tbl'
Get tokens
get_tokens( collection = NULL, language = NULL, corpus = NULL, target_child = NULL, role = NULL, role_exclude = NULL, age = NULL, sex = NULL, token, stem = NULL, part_of_speech = NULL, replace = TRUE, connection = NULL, db_version = "current", db_args = NULL )
collection |
A character vector of one or more names of collections |
language |
A character vector of one or more languages |
corpus |
A character vector of one or more names of corpora |
target_child |
A character vector of one or more names of children |
role |
A character vector of one or more roles to include |
role_exclude |
A character vector of one or more roles to exclude |
age |
A numeric vector of a single age value or a min age value and max age value (inclusive) in months. For a single age value, participants are returned for which that age is within their age range; for two ages, participants are returned whose age range overlaps with the interval between those two ages. |
sex |
A character vector of values "male" and/or "female" |
token |
A character vector of one or more token patterns ('%' matches any number of wildcard characters, '_' matches exactly one wildcard character) |
stem |
A character vector of one or more stems |
part_of_speech |
A character vector of one or more parts of speech |
replace |
A boolean indicating whether to replace "gloss" with "replacement" (i.e. phonologically assimilated form), when available (defaults to TRUE) |
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
A 'tbl' of Token data, filtered down by supplied arguments. If 'connection' is supplied, the result remains a remote query, otherwise it is retrieved into a local tibble.
## Not run: get_tokens(token = "dog") ## End(Not run)
Get transcripts
get_transcripts( collection = NULL, corpus = NULL, target_child = NULL, connection = NULL, db_version = "current", db_args = NULL )
collection |
A character vector of one or more names of collections |
corpus |
A character vector of one or more names of corpora |
target_child |
A character vector of one or more names of children |
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
A 'tbl' of Transcript data, filtered down by supplied arguments. If 'connection' is supplied, the result remains a remote query, otherwise it is retrieved into a local tibble.
## Not run: get_transcripts() ## End(Not run)
Get types
get_types( collection = NULL, language = NULL, corpus = NULL, role = NULL, role_exclude = NULL, age = NULL, sex = NULL, target_child = NULL, type = NULL, connection = NULL, db_version = "current", db_args = NULL )
collection |
A character vector of one or more names of collections |
language |
A character vector of one or more languages |
corpus |
A character vector of one or more names of corpora |
role |
A character vector of one or more roles to include |
role_exclude |
A character vector of one or more roles to exclude |
age |
A numeric vector of a single age value or a min age value and max age value (inclusive) in months. For a single age value, participants are returned for which that age is within their age range; for two ages, participants are returned whose age range overlaps with the interval between those two ages. |
sex |
A character vector of values "male" and/or "female" |
target_child |
A character vector of one or more names of children |
type |
A character vector of one or more type patterns ('%' matches any number of wildcard characters, '_' matches exactly one wildcard character) |
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
A 'tbl' of Type data, filtered down by supplied arguments. If 'connection' is supplied, the result remains a remote query, otherwise it is retrieved into a local tibble.
## Not run: get_types() ## End(Not run)
Get utterances
get_utterances( collection = NULL, language = NULL, corpus = NULL, role = NULL, role_exclude = NULL, age = NULL, sex = NULL, target_child = NULL, connection = NULL, db_version = "current", db_args = NULL )
collection |
A character vector of one or more names of collections |
language |
A character vector of one or more languages |
corpus |
A character vector of one or more names of corpora |
role |
A character vector of one or more roles to include |
role_exclude |
A character vector of one or more roles to exclude |
age |
A numeric vector of a single age value or a min age value and max age value (inclusive) in months. For a single age value, participants are returned for which that age is within their age range; for two ages, participants are returned whose age range overlaps with the interval between those two ages. |
sex |
A character vector of values "male" and/or "female" |
target_child |
A character vector of one or more names of children |
connection |
A connection to the CHILDES database |
db_version |
String of the name of database version to use |
db_args |
List with host, user, and password defined |
A 'tbl' of Utterance data, filtered down by supplied arguments. If 'connection' is supplied, the result remains a remote query, otherwise it is retrieved into a local tibble.
## Not run: get_utterances(target_child = "Shem") ## End(Not run)