To investigate the relationship between GC content and recombination rate, we use two approaches.

(A) GC content variation around CO breakpoints (blue dots and line). Window 0 on the x-axis is the GC content at the breakpoints; negative and positive values represent distance away from the breakpoints. Each window is a 2 kb sequence, and GC content is calculated for each window. The red dots and line show one random sample of GC content, simulated with the same number of windows as there are CO breakpoints. In 10,000 repeats, none of the random samples is as extreme as the observed (blue line) (P < 0.0001). (B) Relationship between recombination and GC content. The chromosomes are divided into 10 kb non-overlapping windows, and recombination rate (cM/Mb) and GC content are obtained for each. The windows are then sorted by GC content and divided into 31 groups (approximately 20% to 51%, 1% intervals), and the average recombination rate (with s.e.m.) is reported for each group.
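The resampling comparison described in panel A can be sketched as a simple empirical test. This is a minimal illustration, not the authors' code: the function names are hypothetical, "extreme" is assumed to mean a mean GC content at least as high as the observed one, and the random samples are assumed to be drawn without replacement from a pool of genome-wide window GC values.

```python
import random
from statistics import mean


def gc_fraction(seq):
    """Fraction of G/C among unambiguous bases (N and other symbols are ignored)."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / called


def empirical_p(observed_gc, genome_gc, n_repeats=10_000, seed=0):
    """One-sided empirical P value for the mean GC of breakpoint windows.

    observed_gc: GC fractions of the windows that contain CO breakpoints
    genome_gc:   GC fractions of all candidate windows (the sampling pool)
    Assumption: "as extreme" means a sample mean >= the observed mean.
    """
    rng = random.Random(seed)
    observed_mean = mean(observed_gc)
    k = len(observed_gc)  # each random sample matches the number of breakpoints
    hits = sum(
        mean(rng.sample(genome_gc, k)) >= observed_mean
        for _ in range(n_repeats)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_repeats + 1)
```

With 10,000 repeats and no random sample reaching the observed value, this estimate is 1/10,001, which is consistent with reporting P < 0.0001.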

In both we dissect the genome into 10 kb non-overlapping windows, of which there are 19,297. First, we ask about the raw correlation between GC% and cM/Mb for these windows, which as expected is positive and significant (Spearman's rho = 0.192; P < 10^-15). Second, we wish to know the average effect of a one-unit increase in either parameter on the other. Given the noise in the data (and given that the current recombination rate need not reflect the ancestral recombination rate), we address this issue using a smoothing approach. We start by rank ordering all windows by GC content and then dividing them into bins of 1% GC range, after excluding windows with more than 10% 'N'. The resulting plot is highly skewed by bins with very high GC (55% to 58%), as these have very few data points (Additional file 1: Figure S10E); the same outliers likely affect the raw correlation too. Removing these three bins results in a more consistent trend (Additional file 1: Figure S10F). This also suggests that below approximately 20% GC the recombination rate is zero (Additional file 1: Figure S10F). Removing bins with GC < 20% and, more generally, any bins with fewer than 100 windows (all bins with GC < 20% have fewer than 100 windows) leaves 18,680 (96.8%) of the windows, these having a GC content between approximately 20% and 51%.
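The two steps above (a rank correlation across windows, then averaging within 1% GC bins) can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the input tuple layout and function names are hypothetical, bins are assumed to be formed by taking the integer part of the GC percentage, and the 10% 'N' and 100-window thresholds are taken from the text.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def bin_by_gc(windows, max_n_frac=0.10, min_windows=100):
    """Summarise recombination rate per 1% GC bin.

    windows: iterable of (gc_percent, n_fraction, rate_cM_per_Mb) tuples
             (hypothetical layout, one tuple per 10 kb window)
    Returns {gc_bin: (mean_rate, sem, n_windows)}.
    """
    bins = defaultdict(list)
    for gc, n_frac, rate in windows:
        if n_frac > max_n_frac:
            continue  # window has too many ambiguous bases
        bins[math.floor(gc)].append(rate)  # assumed binning rule: integer part of GC%

    summary = {}
    for gc_bin, rates in sorted(bins.items()):
        if len(rates) < min_windows:
            continue  # sparse bins give unstable averages
        sem = stdev(rates) / math.sqrt(len(rates)) if len(rates) > 1 else float("nan")
        summary[gc_bin] = (mean(rates), sem, len(rates))
    return summary
```

Filtering on bin size rather than on a fixed GC range handles the sparse low-GC and high-GC bins with a single rule, which matches the observation that all bins below 20% GC contain fewer than 100 windows.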

Relationships between recombination and GC content

From this, we estimate that on average a 1 cM/Mb increase in recombination rate is associated with an increase in GC content of approximately 0.5%. Conversely, a 1% increase in GC content corresponds to an approximately 2 cM/Mb increase in recombination rate. We conclude that, given the apparent rarity of NCO gene conversion, at least in the bee genome, extrapolation from GC content to average crossing-over rate appears to be justifiable, at least for GC content above 20%. We note also that at extreme GC content the recombination rate could be over- or underestimated. This may reflect a discordance between current and past recombination rates.
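The per-unit estimates above can be illustrated with a least-squares slope fitted to the bin averages. The sketch below is an assumption about how such slopes could be obtained, not a description of the fitting procedure used, and the numbers are toy values chosen to lie on an exact line, not data from the study. In general the two regression slopes multiply to the squared Pearson correlation, so they are exact reciprocals (2 and 0.5) only when the smoothed relationship is close to noise-free.

```python
from statistics import mean


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


# Toy values for illustration only (NOT taken from the study):
# bin midpoints in GC% and mean recombination rates in cM/Mb.
toy_gc = [20, 25, 30, 35, 40]
toy_rate = [0, 10, 20, 30, 40]

rate_per_gc = ols_slope(toy_gc, toy_rate)  # cM/Mb gained per 1% GC
gc_per_rate = ols_slope(toy_rate, toy_gc)  # GC% gained per 1 cM/Mb
```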

The retained windows are used to produce Figure 4B, which presents a relatively noise-free (after smoothing) monotonic relationship between the two parameters.

Crossing-over rate is also associated with nucleotide diversity, gene density, and copy number variation regions (Figures S11-S13 in Additional file 1). Given our removal of hetSNPs from the analysis, the latter result is not trivially a CNV-associated artifact. Our fine-scale analyses show a positive correlation between nucleotide diversity and recombination rate at all scales of 10, 100, 200, or 500 kb sequence windows (Figure S11 in Additional file 1). This bolsters prior analyses, one of which reported the trend but found it to be non-significant, while another reported a trend between population genetic estimates of recombination and genetic diversity. The pattern accords with the view that recombination reduces Hill-Robertson interference, lowering rates of hitchhiking and background selection and thus permitting higher diversity. We also find a strong negative correlation between recombination and gene density (Figure S12 in Additional file 1) and a strong positive correlation between recombination and the length of multi-copy regions at various window sizes (Figure S13 in Additional file 1). The correlation with CNVs is consistent with a role for non-allelic recombination generating duplications and deletions via unequal crossing over.