% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/functions_simulations.R
\name{forceLocus}
\alias{forceLocus}
\title{Force the simulations to contain the required number of loci}
\usage{
forceLocus(
  model,
  parameters,
  nSites,
  nLoci,
  nDip,
  mutrate,
  mean,
  variance,
  minimum,
  maximum,
  size,
  min.minor
)
}
\arguments{
\item{model}{a character, either 2pops", "Single" or "Parallel" indicating
which model should be simulated.}

\item{parameters}{a vector of parameters used to create the command line for
the scrm package. Each entry of the vector is a different parameter. Note
that each vector entry should be named with the name of the corresponding
parameter. The output of the \code{CreateParameters} function is the intended
input.}

\item{nSites}{is an integer that specifies how many base pairs should scrm
simulate, i.e. how many sites per locus to simulate.}

\item{nLoci}{an integer that represents how many independent loci should be
simulated.}

\item{nDip}{an integer representing the total number of diploid individuals
to simulate. Note that scrm actually simulates haplotypes, so the number of
simulated haplotypes is double of this. Also note that this is the total
number of diploid individuals and this function will distribute the
individuals equally by the simulated populations.}

\item{mutrate}{an integer representing the mutation rate assumed for the
simulations.}

\item{mean}{an integer or a vector defining the mean value of the negative
binomial distribution from which different number of reads are drawn. It
represents the mean coverage across all sites. If a vector is supplied, the
function assumes that each entry of the vector is the mean for a different
population.}

\item{variance}{an integer or a vector defining the variance of the negative
binomial distribution from which different number of reads are drawn. It
represents the variance of the total coverage across all sites. If a vector
is supplied, the function assumes that each entry of the vector is the
variance for a different population.}

\item{minimum}{an integer representing the minimum coverage allowed. Sites
where any population has a depth of coverage below this threshold are
removed from the data.}

\item{maximum}{an integer representing the maximum coverage allowed. Sites
where any population has a depth of coverage above this threshold are
removed from the data.}

\item{size}{a list with one entry per population. Each entry should be a
vector containing the size (in number of diploid individuals) of each pool.
Thus, if a population was sequenced using a single pool, the vector should
contain only one entry. If a population was sequenced using two pools, each
with 10 individuals, this vector should contain two entries and both will
be 10.}

\item{min.minor}{is an integer representing the minimum allowed number of
minor-allele reads. Sites that, across all populations, have less
minor-allele reads than this threshold will be removed from the data.}
}
\value{
a list with two names entries

\item{pool}{a list with three different entries: major, minor and total.
This list is obtained by running the \code{\link{forcePool}} function.}

\item{nPoly}{a numeric value indicating the mean number of polymorphic
sites across all simulated locus.}
}
\description{
This function attempts to force the required number of loci after the
filtering steps are performed.
}
\details{
This is done by simulating extra loci for each of the different types of
simulations performed. The possible types of simulations include loci without
barriers against migration between divergent ecotypes, loci without migration
from the C towards the W ecotype, loci without migration from the W towards
the C ecotypes and loci where no migration occurs between divergent ecotypes.
Using this function, more loci than required are simulated for each of those
types of simulations.

Then, a coverage-based filter is applied to the data, followed by a filter
based on a required number of minor-allele reads per site. Those filters
remove some loci from the data. The extra simulated loci should allow us to
keep the required number of loci per type of simulation even after filtering.
}
\examples{
# create a vector with parameter values for a two populations model
params <- createParams(Nref = c(25000, 25000), ratio = c(0.1, 3), pool = c(5, 250),
seq = c(0.0001, 0.001), split = c(0, 3), CW = c(1e-13, 1e-3), WC = c(1e-13, 1e-3),
bT = c(0, 0.2), model = "2pops")

# simulate exactly 10 loci - using an isolation with migration model with two populations
forceLocus(model = "2pops", parameters = params, nSites = 1000, nLoci = 10, nDip = 100,
mutrate = 2e-8, mean = c(100, 100), variance = c(250, 250), minimum = 10, maximum = 200,
size = list(50, 50), min.minor = 0)

}
