mirror of https://github.com/rehlds/rehlds.git, synced 2025-04-22 14:23:36 +03:00
Typos
parent 7ece9e2db7
commit 6681490988
@@ -1,15 +1,15 @@
-May 2015 was the performance optimization month for ReHLDS project: hundreds profiler runs and thousands lines of code changes led to over 2x performance boost. In this article I’m going to share performance test results, but before that, I’ll dive into technical background and tell you about the Rehlds demo recorder and player, the feature that allows testing of ReHLDS code and making benchmarks
+May 2015 was the performance optimization month for ReHLDS project: hundreds profiler runs and thousands lines of code changes led to over 2x performance boost. In this article I’m going to share performance test results, but before that, I’ll dive into technical background and tell you about the Rehlds demo recorder and player, the feature that allows testing of ReHLDS code and making benchmarks.

 # ReHLDS demo recorder/player
 ReHLDS demo recorder & player are parts of ReHLDS test suite which, in a nutshell, is the ‘black box’ testing appliance. To understand how it works we should treat ReHLDS as the black box which consumes data from external services, does some processing and sends data back. Services are well-known APIs: Win32 API, standard C library, Steam API.

 <img src="http://dreamstalker.github.io/rehlds/images/wiki_may2015/demo_rp_1.png" width="500"></img>

-Before we can do a ‘block box’ testing we should intercept the data flow between ReHLDS and external services and write it to some file which is called ‘ReHLDS test demo’:
+Before we can do a ‘black box’ testing we should intercept the data flow between ReHLDS and external services and write it to some file which is called ‘ReHLDS test demo’:

 <img src="http://dreamstalker.github.io/rehlds/images/wiki_may2015/demo_rp_2.png" width="650"></img>

-Now we can run ReHLDS in test mode and feed data from file we recorded on previous step. We should also make sure that ReHLDS produces the same output as it produced during recording:
+Now we can run ReHLDS in test mode and feed data from file we recorded on previous step. We should also make sure that ReHLDS produces the same output as it produced by (Re)HLDS during test demo recording:

 <img src="http://dreamstalker.github.io/rehlds/images/wiki_may2015/demo_rp_3.png" width="650"></img>

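The record-and-replay scheme described in the article text above can be illustrated with a small sketch. It is not the ReHLDS implementation: `Recorder`, `Player`, `FakeService` and `server_frame` are hypothetical names invented for this example, and the "external service" is a toy stand-in for the Win32, C library and Steam APIs. The idea shown is only the general one: log every call crossing the boundary while recording, then feed the logged results back and check that the code under test issues the same calls with the same outgoing data.

```python
import json


class Recorder:
    """Wraps a real external service and logs every call and its result."""

    def __init__(self, service):
        self.service = service
        self.log = []  # the "test demo": an ordered list of intercepted calls

    def call(self, name, *args):
        result = getattr(self.service, name)(*args)
        self.log.append({"name": name, "args": list(args), "result": result})
        return result


class Player:
    """Replays a recorded demo instead of calling the real service."""

    def __init__(self, log):
        self.log = log
        self.pos = 0

    def call(self, name, *args):
        expected = self.log[self.pos]
        self.pos += 1
        # The code under test must issue the same calls with the same data
        # it issued during recording; otherwise its behaviour has diverged.
        if expected["name"] != name or expected["args"] != list(args):
            raise AssertionError(f"call #{self.pos} diverged: {name}{args}")
        return expected["result"]


class FakeService:
    """Hypothetical external service standing in for an OS or Steam API."""

    def __init__(self):
        self.now = 0

    def time(self):
        self.now += 10
        return self.now

    def send(self, payload):
        return len(payload)


def server_frame(api):
    """Hypothetical code under test: reads input, then sends output."""
    t = api.call("time")
    return api.call("send", f"tick {t}")


# Recording pass: run against the real (here: fake) service and log the calls.
recorder = Recorder(FakeService())
recorded_output = [server_frame(recorder) for _ in range(3)]

# Serialize and reload to mimic writing the demo to a file.
demo = json.loads(json.dumps(recorder.log))

# Replay pass: no real service involved, only the recorded data.
player = Player(demo)
replayed_output = [server_frame(player) for _ in range(3)]
```

Because the replay pass never touches a real service, its duration depends only on the code under test, which is what makes such a demo usable as a benchmark as well as a regression test.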
@@ -23,7 +23,7 @@ For benchmarking purposes 9 ReHLDS demos were recorded:
 - 3 in optimized engine and stock gamedll
 - 3 in optimized engine and optimized gamedll

-“Optimized” means optimizations that break binary outgoing dataflow compatibility with stock versions of gamedll/engine
+“Optimized” means optimizations that break binary outgoing dataflow compatibility with stock versions of gamedll/engine.

 Demos were recorded in following environment:
 - 32 bots (controlled by FakePlayer v1.11) playing on de_aztec
@@ -42,14 +42,14 @@ Benchmarking session consists of playing demos on each of the following environm
 | Optimized | Stock | Optimized |
 | Optimized | Optimized | Optimized |

-Now let's go through configuration elements
+Now let's go through configuration elements:
 - Engine's pedantic optimizations are optimizations that don’t break binary outgoing dataflow compatibility with stock version of the engine
 - Engine's optimizations consist of pedatic optimizations plus some algorithm changes and use of SSE in several functions
 - Metamod's optimizations consist of bypassing interceptors for following functions: AddToFullPack, ModelIndex, IndexOfEdict, CheckVisibility, GetCurrentPlayer, DeltaUnsetFieldByIndex. This means that metamod plugins are not able to intercept calls to these functions
 - GameDLL's optimization is AngleQuaternion function rewritten using SSE instructions

 # Benchmark results
-Demos in each configuration were played 3 times, average duration was used as a result. 6 different systems were used to run benchmark. Raw result are available <a href="http://dreamstalker.github.io/rehlds/images/wiki_may2015/benchmark_results_raw.csv">here</a>
+Demos in each configuration were played 3 times, average duration was used as a result. 6 different systems were used to run benchmark. Raw result are available <a href="http://dreamstalker.github.io/rehlds/images/wiki_may2015/benchmark_results_raw.csv">here</a>.

 To visualize raw results we should do two things:
 <ol>
@@ -71,9 +71,9 @@ Charts for each CPU:
 <img src="http://dreamstalker.github.io/rehlds/images/wiki_may2015/res_graph_i7-3770.png"></img>

 # Analysis
-It is clearly seen that fully optimized ReHLDS (E:Opt, G:Opt M:Opt) configuration is much faster (2.5 to 3 times) than stock configuration on all CPUs
+It is clearly seen that fully optimized ReHLDS (E:Opt, G:Opt M:Opt) configuration is much faster (2.5 to 3 times) than stock configuration on all CPUs.

-Now we’ll go through each configuration component and examine its impact on performance
+Now we’ll go through each configuration component and examine its impact on performance.

 #### Metamod: stock vs optimized
 Bypassing the plugins invocation on 6 functions (which are hooked very rarely) gives 20% to 30% performance gain.
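The article reports results as averaged demo durations and then quotes speedups ("2.5 to 3 times") and percentage gains. The sketch below shows one way such figures can be derived from repeated plays of a demo. The gain formula is an assumption, since the article does not spell it out, and the durations are illustrative values made up for this example, not measurements from the benchmark.

```python
from statistics import mean


def average_duration(runs):
    """Average wall-clock duration (seconds) over repeated plays of a demo."""
    return mean(runs)


def speedup(baseline_runs, optimized_runs):
    """How many times faster the optimized configuration plays the demo."""
    return average_duration(baseline_runs) / average_duration(optimized_runs)


def gain_percent(baseline_runs, optimized_runs):
    """Performance gain of the optimized build relative to the baseline.

    Assumption: gain is the speedup minus one, so a demo that plays in half
    the time corresponds to a 100% gain (2x speedup).
    """
    return (speedup(baseline_runs, optimized_runs) - 1.0) * 100.0


# Illustrative durations only; these are not measurements from the benchmark.
stock_runs = [120.0, 118.0, 122.0]    # three plays, stock configuration
optimized_runs = [48.0, 47.0, 49.0]   # three plays, optimized configuration

print(f"speedup: {speedup(stock_runs, optimized_runs):.2f}x")
print(f"gain: {gain_percent(stock_runs, optimized_runs):.0f}%")
```

Under this definition a "2.5 times faster" configuration corresponds to a 150% gain, which helps when comparing the overall speedup with the per-component percentages.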
@@ -82,10 +82,10 @@ Bypassing the plugins invocation on 6 functions (which are hooked very rarely) g
 A pack of ReHLDS optimizations gives 65% to 110% (usually around 90%) performance gain.

 #### Engine: ReHLDS w. pedantic opt vs ReHLDS with all optimizations
-Use of SSE instead of FPU in several places gives 11% performance gain
+Use of SSE instead of FPU in several places gives 11% performance gain.

 #### Engine: GameDLL: stock vs optimized
-One function (AngleQuaternion) rewritten using SSE gives 6% performance gain
+One function (AngleQuaternion) rewritten using SSE gives 6% performance gain.

 # Conclusion
-I don't know what to say, actually, since the numbers speak for themselves
+I don't know what to say, actually, since the numbers speak for themselves.